Transmitted herewith for filing under 37 CFR 1.53(b) is a divisional
of prior application no. 07/182,646, filed 15 April 1988.
Attorney's Docket No. 5470-130DV


PATENT 


IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

In re: Application of Frank S. French et al. 

Serial No.: To be assigned 

Filed: Concurrently herewith 

For: ANDROGEN RECEPTOR PROTEINS, RECOMBINANT DNA MOLECULES CODING FOR SUCH, AND USE OF SUCH COMPOSITIONS

Date: February 3, 2000

BOX PATENT APPLICATION 
Assistant Commissioner for Patents 
Washington, DC 20231 

PRELIMINARY AMENDMENT 

Sir: 

Applicants respectfully request entry of the following amendment in the 
above-referenced application. 


In the Claims:

Please cancel Claims 1-4 for the purposes of rewriting. Please cancel 
Claims 5 and 6, which were prosecuted in the parent application. 


7. An isolated and purified DNA sequence encoding human 
androgen receptor. 


8. The isolated and purified DNA sequence according to claim 7, 
said receptor having the amino acid sequence set forth in Figure 5. 


9. The isolated and purified DNA sequence encoding human 
androgen receptor and as set forth in Figure 5. 


10. The isolated and purified DNA sequence according to claim 9,
said DNA sequence having the nucleotide sequence as set forth in Figure 5. 


11. A human androgen receptor protein encoded by the DNA
according to any one of claims 7 to 10.

12. A prokaryotic or eukaryotic host cell transformed or transfected 
with a DNA sequence according to any one of claims 7 to 10. 

13. A viral or circular DNA plasmid comprising a DNA sequence
according to any one of claims 7 to 10. 

14. The viral or circular DNA plasmid according to claim 13 further 
comprising an expression control sequence operatively associated with said 
DNA sequence. 


In the Specification:

Please make the following amendments to the specification. 

On page 1 of the application, after the title, add the following: 

--Related Applications

This application is a divisional of United States Application Serial No. 
07/182,646, filed on April 15, 1988, which is hereby incorporated by reference 
in its entirety.-- 

REMARKS 

The present amendment is submitted to complete the record, to 
present additional claims for substantive examination, and to provide the 
basis for an interference with U.S. Patent No. 5,614,620 to Liao et al., issued 
March 25, 1997. 


Pursuant to 37 CFR 1.607(c), it is noted that claims 7-10 and 12-14
presented above correspond to Claims 1 and 6-9 in U.S. Patent No.
5,614,620 to Liao et al. Other claims previously of record, particularly Claim 2
and Claims 7-14 in the parent application, are substantially the same as those
submitted in Liao. Claims 2 and 7-14 were of record in the present case prior
to or within one year of the issuance of the Liao patent. It is noted that the
parent application was filed within three months subsequent to the filing of the
Liao patent and, accordingly, falls within the provisions of 37 C.F.R. §
1.608(a).

Applicants respectfully submit that the present application is in 
condition for substantive examination, which action is respectfully requested. 


Myers Bigel Sibley & Sajovec 
PO Box 37428 
Raleigh NC 27627 
Telephone (919) 854-1400 
Facsimile (919) 854-1401 

"Express Mail" mailing label number EL533607857US 
Date of Deposit: February 3, 2000 

I hereby certify that this paper or fee is being deposited with the United States Postal Service "Express
Mail Post Office to Addressee" service under 37 CFR 1.10 on the date indicated above and is
addressed to Box PATENT APPLICATION, Assistant Commissioner of Patents, Washington DC 


Respectfully submitted,



Karen A. Magri 
Registration No. 41,965



ANDROGEN RECEPTOR PROTEINS, RECOMBINANT DNA MOLECULES CODING
FOR SUCH, AND USE OF SUCH COMPOSITIONS


This invention was made in the course of research supported in 
part by grants from the National Institutes of Health (NIH HD 16910, 
HD 04466, and HD 18968). 

TECHNICAL FIELD OF THE INVENTION 

This invention relates to recombinant DNA molecules and their
expression products. More specifically this invention relates to 
recombinant DNA molecules coding for androgen receptor protein, 
androgen receptor protein, and use of the DNA molecules and protein 
in investigatory, diagnostic and therapeutic applications. 

BACKGROUND OF THE INVENTION 

The naturally occurring androgenic hormones, testosterone and 
its 5α-reduced metabolite, dihydrotestosterone, are synthesized by
the Leydig cells of the testes and circulate throughout the body 
where they diffuse into cells and bind to the androgen receptor 
protein ("AR"). Androgens, acting through their receptor, stimulate 
development of the male genitalia and accessory sex glands in the 
fetus, virilization and growth in the pubertal male, and maintenance 
of male virility and reproductive function in the adult. The
androgen receptor, together with other steroid hormone receptors,
constitutes a family of trans-acting transcriptional regulatory
proteins that control gene transcription through interactions with 
specific gene sequences. 

When prostate cancer is found to be confined to the prostate 
gland, the treatment of choice is surgical removal. However, bO to 
80% of prostate cancer patients already have metastases at the time 
of diagnosis. Most of their tumors (70 to 80%) respond to the
removal of androgen by castration or by suppression of luteinizing
hormone secretion by the pituitary gland using a gonadotropin
releasing hormone analogue alone or in combination with an
anti-androgen. The degree and duration of response to this
treatment are highly variable (10% live < 6 months, 50% live < 3
years, and 10% live > 10 years). Initially cancer cells regress
without androgen stimulation, but ultimately the growth of androgen
independent tumor cells continues (3b). At present it is not
possible to predict on an individual basis which patient will
respond to hormonal therapy and for how long. If poorly responsive
patients could be identified early, they could be treated by
alternative forms of therapy (e.g. chemotherapy) at an earlier stage
when they might be more likely to respond.

Studies on androgen receptors in prostate cancer have suggested
that a positive correlation may exist between the presence of
androgen receptors in cancer cells and their dependence on
androgenic hormone stimulation for growth. (An analogous situation
exists in mammary carcinoma where there is a correlation between
estrogen receptors and regression of the tumor in response to
estrogen withdrawal.) However, methodological problems in the
measurement of androgen receptors have prevented the routine use of
androgen receptor assays in the diagnostic evaluation of prostate
cancer. Prior to our preparation of androgen receptor antibodies,
all androgen receptor assays were based on the binding of
[3H]-labeled androgen. These assays have been unreliable in human
prostate cancer tissue because of the extreme lability of the
androgen binding site and the presence of unlabeled androgen in the
tissue. Endogenous androgen occupies the binding site on the
receptor and dissociates very slowly (t 1/2 24-48 hr at 0°C). A
further problem is that biopsy samples are quite small, making it
difficult to obtain sufficient tissue for [3H]-androgen binding
assays. Moreover, prostate cancer is heterogeneous with respect to
cell types. Thus within a single biopsy sample there is likely to
be an uneven distribution of cells containing androgen receptors.
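
The practical weight of the slow dissociation noted above can be illustrated with a short calculation. The sketch below assumes simple first-order (exponential) dissociation, which the text does not state explicitly, and uses a 16-hour incubation as a purely hypothetical stand-in for an overnight assay. The function name is illustrative only.

```python
def fraction_still_bound(hours, half_life_hours):
    """Fraction of receptor-bound endogenous androgen remaining after `hours`.

    Assumes first-order dissociation, so the bound fraction halves
    once per half-life. This kinetic model is an assumption.
    """
    return 0.5 ** (hours / half_life_hours)


# Half-lives of 24 and 48 hr are the range quoted in the text;
# the 16 hr incubation time is a hypothetical example value.
for half_life in (24.0, 48.0):
    remaining = fraction_still_bound(16.0, half_life)
    print(f"t1/2 = {half_life:.0f} hr: {remaining:.0%} still bound after 16 hr")
```

Under these assumptions, roughly two thirds to four fifths of the endogenous androgen would still occupy the receptor after an overnight incubation, which is consistent with the difficulty described for labeled-androgen binding assays.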

Development of the male phenotype and maturation of male 
reproductive function are dependent on the interaction of androgenic
hormones with the androgen receptor protein and the subsequent
function of the receptor as a trans-acting inducer of gene 
expression. It has become well established over the past 
twenty-five years that genetic defects of the androgen receptor 
result in a broad spectrum of developmental and functional 
abnormalities ranging from genetic males (46, XY) with female 
phenotype to phenotypically normal males with infertility. 
Isolation of the structural gene for the androgen receptor makes it 
possible to define the nature of these genomic defects in molecular 
terms. Analysis of the functional correlates of the genetic defects 
may lead to a better understanding of the regulation of androgen 
receptor gene expression and of the mechanism of androgen action in 
male sexual development and function. 

The androgen insensitivity syndrome, known also as testicular
feminization, is characterized by an inability to respond to
androgen due to a defect in the androgen receptor, the protein that
mediates the action of androgen within the cell. Androgen
insensitivity is an inherited X-linked trait that occurs in both
complete and incomplete forms. The complete form results in failure
of male sex differentiation during embryogenesis and absence of
virilization at puberty. The result is a 46,XY genetic male with
testes and male internal ducts. The testes produce normal amounts
of testosterone and Müllerian inhibiting substance. Consequently
development of the uterus is inhibited as in the normal male.
Because of the inability to respond to androgen, the external
genitalia remain in the female phenotype with normal clitoris and
labia. A small vagina develops from the urogenital sinus and ends
in a blind pouch. At puberty, feminization with breast development
and female contours occurs in response to testicular estrogen;
however, there is no growth of sexual hair even though circulating
testosterone concentrations are equal to or greater than levels in
the normal male.

Incomplete forms of the androgen insensitivity syndrome include
a spectrum of phenotypes resulting from varying degrees of
incomplete androgen responsiveness. At one extreme, individuals
have mild enlargement of the clitoris and sparse pubic hair. The
opposite extreme is characterized by more complete masculinization
with varying degrees of hypospadias deformity but predominantly a
male phenotype. It has been reported that some adult men with
severe oligospermia or azoospermia who are otherwise normal have
defects in the androgen receptor. These may include as many as 10%
of infertile males.

The genetic defect eliciting this range of abnormalities is
thought to be a single biochemical event at the level of the gene
for the androgen receptor. The androgen receptor is a high affinity
androgen binding protein that mediates the effects of testosterone
and dihydrotestosterone by functioning as a trans-acting inducer of
gene expression. For proper male sexual development to occur, there
is a requirement for androgen and its receptor at a critical time
during embryogenesis and during puberty. The majority of
individuals with the androgen insensitivity syndrome have a history
of affected family members, although about a third are thought to
represent new mutations of this X-linked disorder. The incidence
ranges from 1 in 20,000 to 50,000 male births.

In studies of families with clinical evidence of the androgen
insensitivity syndrome, four main categories were recognized that
range from the most severe, complete absence of receptor binding
activity in a genetic male with female phenotype, to qualitatively
normal receptor in affected individuals. Second in severity are
affected individuals with qualitatively abnormal androgen binding by
receptor present in normal levels. Examples include the failure of
sodium molybdate (a reagent often used in studies on steroid
receptors) to stabilize the receptor of affected individuals when
molybdate is known to stabilize the wild-type receptor. Lability of
the receptor under conditions that normally cause transformation has
also been reported. A third group expresses a decreased amount of
receptor with wild-type in vitro binding characteristics. The final
grouping contains those androgen insensitivity patients in
whom no abnormality in receptor is detected. In a recent study of
this form of the syndrome, the androgen receptor was as capable of
binding oligonucleotides as the wild-type receptor. Indeed, with
the techniques available until only recently, it has been difficult
in certain cases to document an androgen receptor defect in affected
individuals.

Experimental methods used in assessing receptor defects in the
past have relied on the ability of receptor to bind androgen with 
high affinity. The limitation of this methodology is that it is not 
possible to distinguish between the lack of expression of the 
receptor and loss of androgen binding activity. An example of how 
inadequate methodology complicates diagnosis is the absence of 
detectable receptor binding activity in patients who are partially 
virilized. It is theoretically possible for a mutation to occur 
which allows the receptor with defective androgen binding activity 
to induce gene transcription. Biologically active truncated forms 
of the glucocorticoid receptor that lack steroid binding activity 
but retain the DNA binding domain have been demonstrated using 
genetically engineered mutants. 

Purification of the androgen receptor has been difficult to 
accomplish due to its low concentration and high degree of 
instability. Reported attempts at purification using either 
conventional methods of column chromatography or steroid-affinity 
chromatography have yielded insufficient amounts of receptor protein
to allow even the preparation of monoclonal antibodies. 

An early report on the partial purification of the androgen
receptor was disclosed by Mainwaring et al. in "The use of
DNA-cellulose chromatography and isoelectric focusing for the
characterization and partial purification of steroid-receptor
complexes," Biochem J, 134, 113-127 (1973). They used DNA-cellulose
chromatography and isoelectric focusing to isolate the receptor from
rat ventral prostate and determined its physicochemical properties.
This group was among the first to attempt the use of steroid
affinity chromatography in conjunction with conventional
chromatography, using the affinity label
17β-bromoacetoxytestosterone in receptor purification (See
Mainwaring et al., "Use of the affinity label
17β-bromoacetoxytestosterone in the purification of androgen
receptor proteins," Perspectives in Steroid Receptor Research,
(1980)). Partial purification of androgen receptor has also been
attempted from other tissue sources, such as ram seminal vesicles
(See Foekens et al., Molecular Cellular Endocr, 173-186 (1981)
and Foekens et al., "Purification of the androgen receptor of sheep
seminal vesicles," Biochem Biophys Res Comm, 104, 1279-1286
(1982)). The partially purified receptor displayed characteristics
of a proteolyzed receptor, but a purification of 2,000 fold was
reported with a recovery of 33% (See Foekens et al., "Purification
of the androgen receptor of sheep seminal vesicles," Biochem Biophys
Res Comm, 104, 1279-1286 (1982)). Later attempts at purification
continued to combine steroid affinity chromatography with
conventional techniques, reportedly achieving significant
purification, but recoveries too low for further analysis (See Chang
et al., "Purification and characterization of androgen receptor from
steer seminal vesicle," Biochemistry 21, 4102-4109 (1982), Chang et
al., "Purification and characterization of the androgen receptor
from rat ventral prostate," Biochemistry 22, 6170-6175 (1983) and
Chang et al., "Affinity labeling of the androgen receptor in rat
prostate cytosol with
17β-[(bromoacetyl)oxy]-5-alpha-androstan-3-one," Biochemistry 23,
2527-2533 (1984)). More recent studies examine the effectiveness of
a variety of immobilized androgens for their ability to bind the
androgen receptor (See De Larminat et al., "Synthesis and evaluation
of immobilized androgens for affinity chromatography in the
purification of nuclear androgen receptor," The Prostate 5, 123-140
(1984) and Bruchovsky et al., "Chemical demonstration of nuclear
androgen receptor following affinity chromatography with immobilized
ligands," The Prostate 10, 207-222 (1987)). Despite these efforts,
the receptor has not been purified to homogeneity and
the quantities of purified androgen receptor obtained have been
insufficient for the production of antisera.

Clinical assays for the androgen receptor now include several 
methods. The most common is the binding of tritium-labeled hormone 
and measurement of binding using a charcoal adsorption assay. 
Either a natural androgen, such as dihydrotestosterone, or synthetic 
androgen, such as mibolerone or methyltrienolone (R1881), can be
used. An advantage of the latter in human tissue is that it is not 
significantly metabolized and does not bind to the serum androgen 
binding protein, sex steroid binding globulin. A limitation of 
radioisotope labeling of receptor is interference caused by 
endogenous androgen. Although exchange assays for the androgen 
receptor have been described (See Carroll et al., J Steroid Biochem
21, 353-359 (1984) and Traish et al., J Steroid Biochem 23, 40b-413
(1985)), their effectiveness is limited by the slow kinetics of 
dissociation of the endogenous receptor-bound androgen. 

Another method used to assess receptor status is 
autoradiography. In this method, disclosed in Barrack et al.,
"Current concepts and approaches to the study of prostate cancer,"
Progress in Clinical and Biological Research, 239, 156-187 (1987),
the radioactively labeled androgen is incubated with slide-mounted
tissue sections of small tissue biopsy specimens which are then
frozen, sectioned and fixed. Nuclear localization of radioactivity
is detected by exposure of tissue sections to x-ray film. This
technique requires considerable technical expertise, is labor 
intensive, and requires extended periods of exposure time. It is 
therefore of limited usefulness in general clinical assays. Another 
problem is high levels of background signal, i.e. a high 
noise/signal ratio, making it difficult to distinguish 
receptor-bound nuclear radioactivity from unbound radioactivity 
distributed throughout the cells. 

WO 87/05049 (Shine) discloses a method for the production of 
purified steroid receptor proteins, specifically estrogen receptor 
proteins, through the expression of recombinant DNA encoding for
such proteins in eukaryotic host cells. However, the reference does
not disclose the sequence for androgen receptor protein, nor does it
disclose a method for obtaining such a sequence.

SUMMARY OF THE INVENTION 

The present invention provides a DNA sequence characterized by 
a structural gene coding for a polypeptide having substantially the 
same biological activity as androgen receptor protein. A DNA 
sequence encoding androgen receptor protein or a protein having 
substantially the same biological activity as androgen receptor 
activity is also provided. DNA sequences may be obtained from cDNA 
or genomic DNA, or prepared using DNA synthesis techniques. 

The invention further discloses cloning vehicles comprising a 
DNA sequence comprising a structural gene encoding a polypeptide 
having substantially the same biological activity as androgen 
receptor protein. Cloning vehicles comprising a DNA sequence
encoding androgen receptor protein or a protein having substantially
the same biological activity as androgen receptor protein are also
provided. The cloning vehicles further comprise a promoter sequence
upstream of and operatively linked to the DNA sequence. In general
the cloning vehicles will also contain a selectable marker, and,
depending on the host cell used, may contain such elements as
regulatory sequences, polyadenylation signals, enhancers and RNA
splice sites. 

The invention further provides cells transfected or transformed 
to produce androgen receptor protein or a protein having 
substantially the same biological activity as androgen receptor 
protein. 

A further aspect of the invention provides a purified androgen 
receptor protein and purified polypeptides and proteins having 
substantially the same biological activity as androgen receptor 
activity, and methods for producing such proteins and polypeptides. 


BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows a comparison of DNA-binding domains of the human
androgen receptor (hAR) with members of the nuclear receptor
family. (A) is a comparison of oligo A nucleotide sequence with
sequences of hAR and other nuclear receptors: hPR, human
progesterone receptor; hMR, human mineralocorticoid receptor; hGR,
human glucocorticoid receptor; hER, human estrogen receptor; hT3R,
human thyroid hormone receptor; hRAR, human retinoic acid receptor.
Chromosomal locations are shown in parentheses at the left.
Nucleotide identity between oligo A and hAR is indicated with an
asterisk. The percent homology with oligo A is in parentheses at
the right of each sequence. (B) shows the structure of
fibroblast clone ARHFL1 (human fibroblast clone [1]). Nucleotide
residues are numbered from the 5'-terminus. Restriction
endonuclease sites were determined by mapping or were deduced from
DNA sequence. The TGA translation termination codon, determined by
comparison with hPR, hMR and hGR, follows a long open reading frame
containing sequences homologous to those of other steroid
receptors. Arrows indicate exon boundaries in genomic clone X05AR.
The hatched area is the putative DNA binding domain. (C) shows a
comparison of amino acid sequences of the AR DNA-binding domain with
sequences of the nuclear receptor family. AR amino acid sequence
was deduced from nucleotide sequence of clone ARHFL1 and is numbered
beginning with the first conserved cysteine residue (+). Amino acid
numbers in parentheses at the left indicate the residue number of
the first conserved cysteine from the references indicated above.
Percent homology with hAR is indicated in parentheses on the right.
The region of the DNA-binding domain from which the oligo A sequence
was derived is underlined in hAR. Coding DNA of residues 1 to 31 is
contained within genomic clone X05AR. Abbreviations in addition to
those described above are cVDR, chicken vitamin D receptor, and
vERBA, erb A protein from avian erythroblastosis virus.
Abbreviations for amino acid residues are:
A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile; K,
Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr;
V, Val; W, Trp; and Y, Tyr.

Figure 2 illustrates the steroid binding properties of
expressed AR cDNA. (A) shows the structure of pCMVAR in the
expression vector pCMV containing the human cytomegalovirus (CMV)
promoter of the immediate early gene, poly(A) addition-transcription
terminator region of the human growth hormone gene (hGH poly A), SV40
origin of replication (SV40 Ori), and a polylinker region for
insertion of cDNAs. The plasmid pTEBR contains the ampicillin
resistance gene (Amp). (B) shows saturation analysis of
[3H]dihydrotestosterone binding in extracts of pCMVAR-transfected
COS M6 cells. Portions of cytosol (0.1 ml, 0.3 mg/ml protein)
were incubated overnight at with increasing concentrations of
3H-labeled hormone and analyzed by charcoal adsorption.
Nonspecific binding increased from 18% to 37% of total bound
radioactivity. (C) shows a Scatchard plot analysis of
[3H]dihydrotestosterone binding. Error estimation was based on
linear regression analysis (r=0.966). (D) illustrates the
competition of unlabeled steroids for binding of b nM
[3H]dihydrotestosterone in transfected COS M6 cell extracts.
Unlabeled steroids were added at 10- and 100-fold excess of labeled
hormone. Specific binding was determined as previously described.
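
For readers unfamiliar with the Scatchard analysis mentioned for panel (C), the following is a minimal sketch of how such a plot yields binding parameters. It assumes a single class of non-interacting binding sites, for which bound/free = (Bmax - bound)/Kd, so that a straight-line fit of bound/free against bound has slope -1/Kd and y-intercept Bmax/Kd. The data, the parameter values and the function names are hypothetical and are not taken from the figure.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def scatchard(free_nM, bound_nM):
    """Estimate (Kd, Bmax) from free and specifically bound ligand.

    Fits bound/free against bound; slope = -1/Kd, intercept = Bmax/Kd.
    """
    ratios = [b / f for b, f in zip(bound_nM, free_nM)]
    slope, intercept = linear_fit(bound_nM, ratios)
    kd = -1.0 / slope
    bmax = intercept * kd
    return kd, bmax


# Hypothetical, noise-free saturation data generated from
# Kd = 0.5 nM and Bmax = 0.2 nM (illustrative values only).
TRUE_KD, TRUE_BMAX = 0.5, 0.2
free = [0.1, 0.25, 0.5, 1.0, 2.0, 5.0]
bound = [TRUE_BMAX * f / (TRUE_KD + f) for f in free]

kd, bmax = scatchard(free, bound)
print(f"Kd = {kd:.3f} nM, Bmax = {bmax:.3f} nM")
```

With real data the points scatter about the line, and the correlation coefficient of the regression (such as the r value quoted for panel (C)) indicates how well the single-site model fits.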

Figure 3 is a compiled clone map of the human androgen
receptor. The map shows the structure of the human androgen
receptor gene and the relative positions of the nucleic acid
sequences contained in the cDNA probes [A], [B], [C] and [D], human
fibroblast clone [1], human epididymis clones [1] and [b], human
genomic clones [1], [2], [3], [4] and [5], and rat epididymis clones
[1] and [2].

Figure 4 is a photograph showing chromosome localization of the
AR gene on Southern blots of DNA from human cells containing
multiple X chromosomes and mouse or hamster cells with X-autosome 
translocation chromosomes. 


Figure 5 shows the complete single strand sequence (b085 bases)
of the human androgen receptor and the deduced amino acid sequence. 
No intron sequence is included. 

Figure 6 shows the complete single strand sequence (4260 bases) 
of the rat androgen receptor and the deduced amino acid sequence. 

Figure 7 is a photograph of a frozen section of rat ventral 
prostate stained with antibodies (AR-52-3-p) to the AR peptide
NH2-Asp-His-Val-Leu-Pro-Ile-Asp-Tyr-Tyr-Phe-Pro-Pro-Gln-Lys-Thr in
a dilution of 1 to 3000 using the avidin-biotin peroxidase 
technique. Androgen receptor is indicated by brown staining of 
nuclei in epithelial cells. 

Figure 8 is a photograph showing restriction fragment length 
polymorphisms in the human androgen receptor gene. 

Figure 9 is a photograph showing a Southern blot analysis of
the human androgen receptor gene in complete androgen insensitivity
syndrome patients. 

DETAILED DESCRIPTION OF THE INVENTION 

In the description the following terms are employed:

Nucleotide

A monomeric unit of DNA or RNA consisting of a sugar moiety
(pentose), a phosphate, and a nitrogenous heterocyclic base. The
base is linked to the sugar moiety via the glycosidic carbon (1'
carbon of the pentose) and that combination of base and sugar is a
nucleoside. The base characterizes the nucleotide. The four DNA
bases are adenine ("A"), guanine ("G"), cytosine ("C") and thymine
("T"). The four RNA bases are A, G, C and uracil ("U"). A and G
are purines, abbreviated to R, and C, T, and U are pyrimidines,
abbreviated to Y.

DNA Sequence 

A linear series of nucleotides connected one to the other by 
phosphodiester bonds between the 3' and 5' carbons of adjacent
pentoses.

Codon

A DNA sequence of three nucleotides (a triplet) which encodes
through mRNA an amino acid, a translational start signal or a
translational termination signal. For example, the nucleotide
triplets TTA, TTG, CTT, CTC, CTA and CTG encode for the amino acid
leucine ("Leu"), TAG, TAA and TGA are translational stop signals and
ATG is a translational start signal.

Reading Frame 

The grouping of codons during translation of mRNA into amino
acid sequences. During translation the proper reading frame must be
maintained. For example, the sequence GCTGGTTGTAAG may be
translated in three reading frames or phases, each of which affords
a different amino acid sequence:

GCT GGT TGT AAG - Ala-Gly-Cys-Lys

G CTG GTT GTA AG - Leu-Val-Val

GC TGG TTG TAA G - Trp-Leu-(STOP)
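
The three reading frames of the example sequence can be reproduced with a short sketch. It uses the standard genetic code written in DNA letters and one-letter amino acid codes (with "*" for a stop signal). The table construction and the function name are illustrative and are not part of the specification.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...). "*" marks a translational stop.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_frame(dna, offset):
    """Translate `dna` starting at `offset` (0, 1 or 2).

    Incomplete trailing codons are ignored and translation
    halts after the first stop codon.
    """
    peptide = []
    for i in range(offset, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        peptide.append(amino_acid)
        if amino_acid == "*":
            break
    return "".join(peptide)


frames = [translate_frame("GCTGGTTGTAAG", k) for k in range(3)]
print(frames)  # ['AGCK', 'LVV', 'WL*']
```

In one-letter code the three results correspond to Ala-Gly-Cys-Lys, Leu-Val-Val and Trp-Leu-(STOP), matching the three phases listed above.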

Polypeptide 

A linear series of amino acids connected one to the other by 
peptide bonds between the α-amino and carboxy groups of adjacent
amino acids. 

Genome 

The entire DNA of a substance. It includes inter alia the 
structural genes encoding for the polypeptides of the substance, as 
well as operator, promoter and ribosome binding and interaction 
sequences including sequences such as the Shine-Dalgarno sequences. 

Structural Gene 

A DNA sequence which encodes through its template or messenger 
RNA ("mRNA") a sequence of amino acids characteristic of a specific 
polypeptide. 

Transcription 

The process of producing mRNA from a structural gene.

Translation

The process of producing a polypeptide from mRNA.

Expression

The process undergone by a structural gene to produce a
polypeptide. It is a combination of transcription and translation.

Plasmid

A non-chromosomal double-stranded DNA sequence comprising an
intact "replicon" such that the plasmid is replicated in a host
cell. When the plasmid is placed within a unicellular organism, the
characteristics of that organism are changed or transformed as a
result of the DNA of the plasmid. For example, a plasmid carrying
the gene for tetracycline resistance (TetR) transforms a cell
previously sensitive to tetracycline into one which is resistant to
it. A cell transformed by a plasmid is called a "transformant."

Phage or Bacteriophage

Bacterial viruses, many of which include DNA sequences
encapsidated in a protein envelope or coat ("capsid"). In a
unicellular organism a phage may be introduced as free DNA by a
process called transfection.

Cloning Vehicle

A plasmid, phage DNA or other DNA sequences which are able to
replicate in a host cell, characterized by one or a small number of
endonuclease recognition sites at which such DNA sequences may be
cut in a determinable fashion without attendant loss of an essential
biological function of the DNA, e.g., replication, production of
coat proteins or loss of promoter or binding sites, and which
contain a marker suitable for use in the identification of
transformed cells, e.g., tetracycline resistance or ampicillin
resistance. A cloning vehicle is often called a vector.

Cloning

The selection and propagation of a single species.

Recombinant DNA Molecule

A hybrid DNA sequence comprising at least two nucleotide
sequences, the first sequence not normally being found together in
nature with the second.

Expression Control Sequence

A DNA sequence of nucleotides that controls and regulates 
expression of structural genes when operatively linked to those 
genes. 

To attain the objects of this invention it was necessary to 
determine the amino acid sequence and the DNA sequence of the 
structural gene encoding androgen receptor protein. One
conventional approach would involve starting with the purified
androgen receptor protein. However, as described above, significant 
amounts of the protein for such purposes have not been obtained. 

An alternative approach to circumvent the overwhelming 
difficulty of androgen receptor protein purification is direct
isolation of the DNA encoding the messenger RNA for androgen 
receptor protein. 

Our strategy for isolating AR DNA was based on evidence that 
the AR gene is X-linked and that no other steroid receptor gene is
located on the X chromosome. Sequence data are available from cDNAs
for glucocorticoid, estrogen, progesterone, mineralocorticoid and
vitamin D receptors. Comparison of the derived amino acid sequences 
has revealed a central region of high cysteine content which was 
found also in the v-erb A oncogene product recently identified as 
the thyroid hormone receptor. Within this 61-63 amino acid region 
is an arrangement of 9 cysteine residues that are absolutely 
conserved among steroid receptors thus far characterized. The 
overall homology among sequences in this conserved region ranges 
between 40 and 90%. We assumed that AR would resemble other members 
of the steroid receptor family in the conserved DNA-binding domain. 

A human X chromosomal library was screened with the synthetic 
oligonucleotide probe A (Oligo A sequence = 5'CTT TTG AAG AAG
ACC TTA CAG CCC TCA CAG GT3') of Figure 1(A) designed as a
consensus sequence from the conserved sequence of the DNA-binding 
domain of other steroid receptors. Screening the library with the 
oligo A probe resulted in several recombinants whose inserts were 
cloned into bacteriophage M13 DNA and sequenced. One recombinant 
clone (Charon 35 X05AR) (human genomic clone [1]) contained a
sequence similar to, yet distinct from, the DNA-binding domains of
other steroid receptors. It had 84% sequence identity with oligo A, 
while other receptor DNAs were 78% to 91% homologous with the 
consensus oligonucleotide. 
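
By way of illustration only, percent identity figures of the kind quoted above can be computed by counting matching positions between two aligned oligonucleotides of equal length. The following minimal Python sketch assumes an ungapped alignment; the function name `percent_identity` and the comparison sequence are hypothetical and are not taken from the clones described herein.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Return the percentage of aligned positions at which two sequences match."""
    a = seq_a.replace(" ", "").upper()
    b = seq_b.replace(" ", "").upper()
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


# Oligo A as given in the text above (32 bases).
OLIGO_A = "CTT TTG AAG AAG ACC TTA CAG CCC TCA CAG GT"

# Hypothetical target carrying five substitutions, for illustration only.
HYPOTHETICAL_TARGET = "CTT CTG AAG AAA ACC TTG CAG CCT TCA CAA GT"

if __name__ == "__main__":
    print(round(percent_identity(OLIGO_A, HYPOTHETICAL_TARGET), 1))
```

With 32 aligned bases, five mismatches leave 27 matches out of 32, which is about 84%.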

From the nucleotide sequence just 5' of the DNA-binding domain,
oligonucleotide probe B (Oligo B sequence = 5'GGA CCA TGT TTT GCC
CAT TGA CTA TTA CTT TCC ACC CC3') was synthesized and used to
screen bacteriophage lambda gt11 cDNA libraries from human
epididymis and cultured human foreskin fibroblasts. Recombinant
phage (unamplified) screened with this oligonucleotide by in situ
hybridization revealed one positive clone in each library. The
epididymal clone (gt11 ARHEL1) (human epididymis clone [1]) contained
the complete DNA-binding domain and approximately 1.5 kb of upstream
sequence, whereas the fibroblast clone (gt11 ARHFL1) (human
fibroblast clone [1]) shown in Figure 1(B) contained the DNA-binding 
domain and 1.5 kb of downstream sequence. The DNA-binding domains 
of the cDNA isolates were identical to that of the genomic exon 
sequence. 

Transient expression in monkey kidney cells (COS M6)
demonstrated that the human foreskin fibroblast cDNA fragment
encodes the steroid-binding domain of hAR. A DNA fragment
(ARHFL1H-X) extending 5' to 3' from the Hind III site within the
putative DNA-binding domain through the stop codon (TGA) was cloned
into pCMV as shown in Figure 2(A). Expression was facilitated by
adding to the 5' end a consensus translation initiation sequence
containing the methionine codon (ATG) in reading frame.
Transfection of the recombinant construct produced a protein with
high affinity for [3H]dihydrotestosterone (Figure 2(C)), saturable
at physiological levels of hormone (see Figure 2(B)). The binding
constant [Kd = 2.7 (± 1.4) x 10^-10 M] was nearly identical to
that of native AR. The level of expressed protein, 1.3 pmol per 
milligram of protein, was 20 to 60 times greater than that in male 
reproductive tissues. Mock transfections without plasmid or
transfections with plasmid DNA lacking the AR insert yielded no
specific binding of dihydrotestosterone. Figure 2(D) shows that steroid
specificity was identical to that of native AR, with highest
affinity for dihydrotestosterone and testosterone, intermediate
affinity for progesterone and estradiol, and low affinity for
cortisol.
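
As a non-limiting illustration of what saturable, high-affinity binding implies, the following Python sketch evaluates fractional receptor occupancy under a simple single-site equilibrium model. The model, the function name `fraction_bound`, and the reading of the binding constant as a dissociation constant of roughly 2.7 x 10^-10 M are assumptions made for this example only.

```python
def fraction_bound(ligand_molar: float, kd_molar: float) -> float:
    """Fractional occupancy for one class of independent binding sites."""
    if ligand_molar < 0 or kd_molar <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_molar / (kd_molar + ligand_molar)


# Assumed dissociation constant (molar), used here for illustration only.
ASSUMED_KD = 2.7e-10

if __name__ == "__main__":
    for conc in (1e-10, 2.7e-10, 1e-9, 1e-8):
        print(f"{conc:.1e} M -> {fraction_bound(conc, ASSUMED_KD):.2f} occupied")
```

Under this model, occupancy is one half when the free hormone concentration equals the dissociation constant and approaches saturation as the concentration rises well above it.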

Figure 3 is a clone map compiled to show the human androgen 
receptor gene and the nucleic acid sequences in the cDNA clones, 
human genomic clones, human fibroblast clones, human epididymis 
clones, and rat epididymis clones. Human fibroblast clone [1]
extended through the stop codon or the C-terminal end of the
androgen receptor protein. To isolate and elucidate the sequence of
the 5' or N-terminal end of the androgen receptor protein, we used an
EcoRI/SstI fragment (the EcoRI site was from the linker) from the 5' end
of human epididymis clone [1] as a probe (cDNA probe [A]) to
rescreen the human X chromosomal library by standard techniques. By
these techniques, human genomic clone [2] was isolated and in turn
used as a probe to rescreen a human epididymis library and isolate
human epididymis clone [5]. The N-terminal sequence was elucidated
along with the 5' flanking sequence of the androgen receptor protein
and gene. Human genomic clones [3], [4] and [5] for the sequence 3'
of human genomic clone [1] were obtained using cDNA probes B [a Hind
III/EcoRI fragment] and C [an EcoRI fragment], by screening and
isolating by standard techniques. 

Two rat clones, rat epididymis clones [1] and [2], were
isolated from a rat epididymis cDNA library using as probes the
complete human epididymis clone [1] and an EcoRI/PstI fragment, cDNA
probe [D], respectively. These rat clones contained the entire
protein coding sequence for the rat androgen receptor, plus flanking
5' and 3' untranslated sequences which were used to confirm the
sequence of the human androgen receptor. 

The complete double-stranded sequence encoding the human 
androgen receptor protein was determined and is set forth in Figure
4. The single-stranded DNA sequence encoding human androgen
receptor protein along with the amino acid sequence which it codes
for are set forth in Figure 5. The single-stranded DNA sequence and
the amino acid sequence for the rat androgen receptor protein are set
forth in Figure 6.
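
Because a double-stranded sequence is presented as one strand written above its complement, the two printed strands can be checked against each other base by base. The following Python sketch is offered only as an illustration of that check; the helper names and the ten-base example blocks are assumptions chosen for the example.

```python
# Watson-Crick pairing; N (unresolved base) is left as N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def complement(strand: str) -> str:
    """Return the base-by-base complement, written in the same left-to-right order."""
    return strand.replace(" ", "").upper().translate(_COMPLEMENT)


def disagreements(top: str, bottom: str) -> list[int]:
    """Return 1-based positions where the bottom strand does not pair with the top."""
    expected = complement(top)
    observed = bottom.replace(" ", "").upper()
    if len(expected) != len(observed):
        raise ValueError("strands must be the same length")
    return [i + 1 for i, (e, o) in enumerate(zip(expected, observed)) if e != o]


if __name__ == "__main__":
    # Illustrative ten-base blocks written top strand over bottom strand.
    print(disagreements("CAAAATTGAG", "GTTTTAACTC"))
```

An empty list indicates that every position pairs correctly; any listed position marks a base worth re-reading in one strand or the other.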

Recombinant DNA clones human fibroblast clone [1], isolated from a
human foreskin fibroblast cDNA gt11 expression library, and human
epididymis clones [1] and [6], isolated from a human epididymis cDNA
gt11 expression library, were deposited in the American Type Culture
Collection with accession numbers ATCC #      , ATCC #       and
ATCC #      , respectively. Human genomic clones [1], [2], [3], [4]
and [5], which were isolated from a human X chromosome lambda Charon 35
library available as ATCC # 57750, have been deposited with the
American Type Culture Collection with accession numbers ATCC
#      , ATCC #      , ATCC #      , ATCC #       and ATCC
#      , respectively.

A wide variety of host-cloning vehicle combinations may be
usefully employed in cloning the double stranded DNA disclosed 
herein. For example, useful cloning vehicles may include 
chromosomal, non-chromosomal and synthetic DNA sequences such as
various known bacterial plasmids and wider host range plasmids such 
as pCMV and vectors derived from combinations of plasmids and phage 
DNA such as plasmids which have been modified to employ phage DNA 
expression control sequences. Useful hosts may include bacterial 
hosts, yeasts and other fungi, animal or plant hosts, such as 
Chinese Hamster Ovary cells (CHO) or monkey kidney cells (COS M6),
and other hosts. The particular selection of host-cloning vehicle 
combinations may be made by those of skill in the art after due 
consideration of factors such as the source of the DNA, i.e., genomic
or cDNA. 

Cloning vehicles for use in carrying out the present invention
will further comprise a promoter operably linked to the DNA sequence 
encoding the androgen receptor protein. In some instances it is 
preferred that cloning vehicles further comprise an origin of 
replication, as well as sequences which regulate and/or enhance
expression levels, depending on the host cell selected. 

Techniques for transforming hosts and expressing foreign cloned
DNA in them are well known in the art (See, for example, Maniatis et
al., infra). Cloning vehicles used for expressing foreign genes in
bacterial hosts will generally contain a selectable marker, such as
a gene for antibiotic resistance, and a promoter which functions in
the host cell.

Eukaryotic microorganisms, such as the yeast Saccharomyces 
cerevisiae, may also be used as host cells. Cloning vehicles will 
generally comprise a selectable marker, such as the nutritional 
marker TRP, which allows selection in a host strain carrying a trp1
mutation. To facilitate purification of an androgen receptor
protein produced in a yeast transformant, a yeast gene encoding a
secreted protein may be joined to the sequence encoding androgen 
receptor protein. 

Higher eukaryotic cells can also serve as host cells in 
carrying out the present invention. Cultured mammalian cells are 
preferred. Cloning vehicles for use in mammalian cells will 
comprise a promoter capable of directing the transcription of a 
foreign gene introduced into a mammalian cell. Also contained in 
the expression vector is a polyadenylation signal, located 
downstream of the insertion site. The polyadenylation signal can be 
that of the cloned androgen receptor gene, or may be derived from a 
heterologous gene. 

A selectable marker, such as a gene that confers a selectable 
phenotype, is generally introduced into the cells along with the 
gene of interest. Preferred selectable markers include genes that 
confer resistance to drugs, such as neomycin, hygromycin and 
methotrexate. Selectable markers may be introduced into the cell on 
a separate plasmid at the same time as the gene of interest, or they 
may be introduced on the same plasmid. 

The copy number of the integrated gene sequence can be
increased through amplification by using certain selectable
markers. Through selection, expression levels may be substantially
increased.

Androgen receptor proteins may be purified from the host cells 
or cell media according to the present invention using techniques
well known to those in the art. Such proteins may be utilized to
produce monoclonal or polyclonal antibodies according to the 
techniques described below. 

The techniques of this invention offer considerable advances 
over existing technology for measurement of androgen receptor.
Utilizing proteins and peptides containing the disclosed sequences,
monoclonal or polyclonal antibodies can be produced for use as
immunochemical reagents in immunodiagnostic assays. For example,
radioimmunoassays and ELISA assays can be developed utilizing these
reagents which will allow detection and quantification of androgen
receptor in the presence of endogenous androgen since such androgen
will not interfere with antibody binding to the receptor. 

Immunocytochemistry utilizing our reagents enables
determination and quantification of the cellular distribution of the
androgen receptor in tumor tissues, which are often heterogeneous in
composition. This assay offers great potential for diagnostic
evaluation of prostate cancer to determine responsiveness to
androgen withdrawal therapy. 

In addition, the antibodies produced using the disclosed amino 
acid sequences can also be used in processes for the purification of
androgen receptor protein produced by the above methods. One such
purification process is disclosed in Logeat, F., et al.,
Biochemistry vol. 24 (1985), pp. 1029-1035, which is incorporated by
reference herein.

Androgen receptor proteins and polypeptides synthesized from 
the deduced amino acid sequence can be used as immunogens for the
preparation of antibodies to the androgen receptor. Peptides for
such use range from about 3 to about 958 amino acids in
length and are preferably from about 15 to about 30 amino acids in
length. Shorter peptides may have significant sequence homology to
other steroid receptor proteins and larger peptides may contain
multiple antigenic determinants; these properties could result in 
antibodies with cross-reactivities to other steroid receptor 
proteins. 

Peptides can be synthesized from amino acid sequences in the
NH2-terminal region, the DNA-binding domain, and the
carboxyl-terminal steroid binding domain. Peptide selection will be
based on hydropathic plots, selecting hydrophilic regions that are
more likely exposed on the receptor surface. For diagnostic
purposes preferred sequences will be selected from the
NH2-terminal region where there is the least homology with other
steroid receptor proteins.
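
By way of example only, a hydropathic plot can be computed as a sliding-window average of per-residue hydropathy values, with the lowest-scoring windows taken as candidate hydrophilic regions. The sketch below assumes the Kyte-Doolittle scale and a seven-residue window; neither choice is specified in this description, and the function names and the test sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 7) -> list[float]:
    """Mean hydropathy of every window of the given length along the sequence."""
    residues = protein.upper()
    if not 0 < window <= len(residues):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[r] for r in residues]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def most_hydrophilic_window(protein: str, window: int = 7) -> tuple[int, str]:
    """Return the 0-based start and residues of the lowest-scoring window."""
    profile = hydropathy_profile(protein, window)
    start = min(range(len(profile)), key=lambda i: profile[i])
    return start, protein[start:start + window]


if __name__ == "__main__":
    # Hypothetical sequence: hydrophobic flanks around a lysine-rich stretch.
    print(most_hydrophilic_window("IIIIIIIKKKKKKKLLLLLLL"))
```

In practice the window length and scale would be chosen to suit the peptide lengths under consideration, and candidate regions would still need to be checked for homology with other steroid receptors.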

Peptides for use as immunogens can be synthesized using 
techniques available to one of ordinary skill in the art. For 
example, peptides corresponding to androgen receptor sequences can 
be synthesized using tBOC chemistry on a Biosearch Model 9500
peptide synthesizer. Peptide purity is assessed by high pressure
liquid chromatography. Peptides can be conjugated to keyhole limpet
hemocyanin through cysteine residues using the coupling agent
m-maleimido-benzoyl-N-hydroxysuccinimide ester. One can also
prepare resin-bound peptides utilizing the p-(oxymethyl benzamide)
handle to attach the C-terminal amino acid to solid-phase resin 
support. 

Proteins and peptides of this invention can be utilized for the 
production of polyclonal or monoclonal antibodies. Methods for
production of such antibodies are known to those of ordinary skill 
in the art and may be performed without undue experimentation. One 
method for the production of monoclonal antibodies is described in 
Kohler, G., et al., "Continuous Culture of Fused Cells Secreting 
Antibody of Predefined Specificity," Nature, vol. 256 (1975), p.
495, which is incorporated herein by reference. Polyclonal
antibodies, by way of example, can be produced by the method 
described below. 
Peptide conjugates or resin-bound peptides can be injected into
rabbits according to the procedure of Vaitukaitis et al., J Clin
Endocrinol Metab, 33, 988-991 (1971) using a standard immunization
schedule. Antisera titers can be determined in the ELISA assay. 

For example, one androgen receptor sequence,
NH2-Asp-His-Val-Leu-Pro-Ile-Asp-Tyr-Tyr-Phe-Pro-Pro-Gln-Lys-Thr,
in the 5' region upstream from the DNA-binding domain, was used to
raise antisera in rabbits. The antisera react selectively at a
dilution of 1 to 500 with the androgen receptor both in its
untransformed 8-10S form and in its 4-5S transformed form. Receptor
sedimentation on sucrose gradients increases from 4 to 8-10S in the
presence of antiserum at high ionic strength and from 8-10S to
11-12S at low ionic strength sucrose gradients. In the ELISA
reaction against the peptide used as immunogen, reactivity was 
detectable at 1 to 25,000 dilution. This antiserum at a dilution of 
1 to 3000 was found effective in staining nuclear androgen receptor 
in rat prostate and other male accessory sex glands (see Figure 7). 
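
The relationship between this peptide and the nucleotide sequence just 5' of the DNA-binding domain can be illustrated by translating oligonucleotide probe B, as given earlier in this description, using the standard genetic code. The following Python sketch is a minimal example; the reading-frame offset of one base is an assumption chosen because it yields the first twelve residues of the peptide (Asp-His-Val-Leu-Pro-Ile-Asp-Tyr-Tyr-Phe-Pro-Pro), and the function name is hypothetical.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def translate(dna: str, offset: int = 0) -> str:
    """Translate complete codons of a DNA string, starting at the given offset."""
    seq = dna.replace(" ", "").upper()[offset:]
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# Oligo B as given in this description (38 bases).
OLIGO_B = "GGA CCA TGT TTT GCC CAT TGA CTA TTA CTT TCC ACC CC"

if __name__ == "__main__":
    print(translate(OLIGO_B, offset=1))
```

Read in that frame, the probe encodes the one-letter sequence DHVLPIDYYFPP, matching the amino-terminal portion of the immunizing peptide.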

Our invention provides new molecular probes comprising 
complementary DNA sequences derived from the deduced sequences 
encoding the androgen receptor for diagnostic purposes. Such probes 
may be used to detect the presence of androgen receptor mRNA in 
tumor cells. Such probes may also be used for detection of androgen 
receptor gene defects. Androgen receptor complementary DNA 
sequences can be used as hybridization probes to detect 
abnormalities in the androgen receptor gene or in its messenger RNA. 

Androgen receptor DNA sequences disclosed and complementary RNA 
sequences can be used to construct probes for use in DNA 
hybridization assays. An example of one such hybridization assay 
and methods for constructing probes for such assays are disclosed 
in U.S. Patent No. 4,683,195 to Mullis et al., U.S. Patent No. 
4,683,202 to Mullis, U.S. Patent No. 4,617,261 to Sheldon, III et 
al., U.S. Patent No. 4,683,194 to Saiki et al., and U.S. Patent No.
4,705,886 to Levenson et al., which are hereby incorporated by 
reference. 
By example, one method for detecting gene deletion utilizes
Southern blotting and hybridization. DNA can be isolated from
cultured skin fibroblasts or from leukocytes obtained from blood.
DNA is cut with restriction enzymes, electrophoresed on an agarose
gel, blotted onto nitrocellulose, and hybridized with
[32P]-labeled androgen receptor DNA (see Maniatis, T. et al.,
Molecular Cloning, A Laboratory Manual, Cold Spring Harbor
Laboratory, 1982, incorporated by reference herein).

In addition, small mutations can be detected utilizing methods
known to one of ordinary skill in the art, from cultured skin
fibroblasts of the affected individual. A cDNA library can be
prepared using standard techniques. The androgen receptor clones
can be isolated using a [32P]DNA AR probe. The cloned AR cDNA can
then be sequenced and compared to normal AR cDNA sequences.

Alternatively genomic DNA can be isolated from blood leukocytes 
or cultured skin fibroblasts of the affected individual. The DNA is
then subjected to restriction enzyme digestion and electrophoresis, and
is blotted onto nitrocellulose. Synthetic oligonucleotides can be
used to bracket specific exons. Exon sequences are amplified using 
the polymerase chain reaction, cloned into M13 and sequenced. The 
sequences are compared to normal human AR DNA sequences. 
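
The comparison step can be illustrated, without limitation, by listing each position at which a patient sequence departs from the normal sequence. The Python sketch below assumes the two sequences are already aligned and of equal length, so it reports substitutions only; the function name and both example sequences are hypothetical.

```python
def point_differences(normal: str, patient: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, normal base, patient base) for each mismatch."""
    ref = normal.replace(" ", "").upper()
    obs = patient.replace(" ", "").upper()
    if len(ref) != len(obs):
        raise ValueError("sequences must be aligned to the same length")
    return [(i + 1, r, o) for i, (r, o) in enumerate(zip(ref, obs)) if r != o]


if __name__ == "__main__":
    # Hypothetical nine-base stretches, for illustration only.
    print(point_differences("ATGGAAGTG", "ATGGAGGTG"))
```

Insertions or deletions would shift the register and require a gapped alignment before such a position-by-position comparison is meaningful.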

Another method of identifying small mutations or deletions 
takes advantage of the ability of RNase A to cleave regions of 
single stranded RNA in RNA:DNA hybrids. Genomic DNA isolated from
fibroblasts of affected individuals is hybridized with radioactive
RNA probes (Promega Biotec) prepared from wild-type androgen
receptor cDNA. Mismatches due to mutations would be cleaved by 
RNase A and result in altered sized bands relative to wild-type on 
denaturing polyacrylamide gels. 
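
As a simplified illustration of why mismatches give altered band sizes, the protected probe can be modeled as being cut once at each mismatch position. The Python sketch below makes that idealized assumption (complete cleavage at every mismatch and nowhere else); the function name and the probe length are hypothetical.

```python
def cleavage_fragments(probe_length: int, mismatch_positions: list[int]) -> list[int]:
    """Fragment lengths expected if the probe is cut at each mismatch position."""
    cuts = sorted(set(mismatch_positions))
    if any(not 0 < pos < probe_length for pos in cuts):
        raise ValueError("mismatch positions must fall inside the probe")
    edges = [0] + cuts + [probe_length]
    return [right - left for left, right in zip(edges, edges[1:])]


if __name__ == "__main__":
    # Hypothetical 600-base probe with a single mismatch 250 bases from one end.
    print(cleavage_fragments(600, [250]))
```

A wild-type sample yields the full-length protected fragment, whereas a single mismatch replaces it with two shorter bands whose lengths sum to the probe length.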

Restriction fragment length polymorphism (RFLP) linked to the 
androgen receptor gene locus may be used in prenatal diagnosis and 
carrier detection of androgen insensitivity. For example, the
presence of RFLPs in normal individuals is first established by 
isolating DNA from lymphocytes of at least six females (total of 12 
X chromosomes). DNA can be isolated using the proteinase K 
procedure and fragmented using a battery of restriction enzymes. 
Preferred are those enzymes that contain the dinucleotide sequence 
CG in their recognition sequence. Southern blots are screened with 
5-10 kb androgen receptor genomic fragments which if possible lack 
repetitive DNA. For those regions containing repetitive elements, 
total human genomic DNA can be added as competitor in the 
hybridization reaction. Alternatively, one can subclone selected 
regions to yield a probe free of repetitive elements. 
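
The stated preference for enzymes whose recognition sequence contains the dinucleotide CG can be applied as a simple filter over a panel of candidate enzymes. The following Python sketch is illustrative only; the panel shown is an assumed example and not a list of enzymes used herein, and the function name is hypothetical.

```python
# Illustrative panel of common restriction enzymes and their recognition sites.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "PstI": "CTGCAG",
    "BamHI": "GGATCC",
    "TaqI": "TCGA",
    "MspI": "CCGG",
}


def enzymes_with_cg(sites: dict[str, str]) -> list[str]:
    """Return, in alphabetical order, enzymes whose site contains the dinucleotide CG."""
    return sorted(name for name, site in sites.items() if "CG" in site.upper())


if __name__ == "__main__":
    print(enzymes_with_cg(RECOGNITION_SITES))
```

Of the assumed panel, only TaqI and MspI pass the filter.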

For example, a human restriction fragment length polymorphism was determined
by cDNA probe (B) and Hind III restriction endonuclease using the
Southern blot technique (See Figure 8). The two RFLP alleles
detected are a fragment at 6.5 kb (allele 1) and a fragment at 3.5 kb
(allele 2). Major constant fragment bands are seen at approximately
2 and 5 kb with minor constant bands at 0.9 and 7.5 kb. Allele 1 is
present in approximately 30% of the X chromosomes of the Caucasian 
population. Allele 2 is present in approximately 20% of the X 
chromosomes of the Caucasian population. In Figure 9 Lanes A, B and 
D, DNA from women who are homozygous for allele 1 is shown. In 
Figure 9 Lane C, DNA from a woman who is heterozygous, carrying both
alleles 1 and 2, is shown. Figure 9 Lane E contains DNA from a man
who possesses only allele 2. This RFLP, and others determined by
the clones we have isolated, will enable one to monitor the androgen 
receptor gene in various disease conditions described herein. 
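
By way of example, the alleles present in a lane can be read from the observed band sizes by looking for the 6.5 kb (allele 1) and 3.5 kb (allele 2) fragments and ignoring the constant bands. The Python sketch below assumes a sizing tolerance of 0.2 kb, which is a hypothetical value, as is the function name.

```python
# Polymorphic Hind III fragment sizes (kb) for the two alleles described above.
ALLELE_BANDS_KB = {1: 6.5, 2: 3.5}


def call_alleles(band_sizes_kb: list[float], tolerance_kb: float = 0.2) -> list[int]:
    """Return the sorted allele numbers whose diagnostic band is present."""
    found = set()
    for size in band_sizes_kb:
        for allele, expected in ALLELE_BANDS_KB.items():
            if abs(size - expected) <= tolerance_kb:
                found.add(allele)
    return sorted(found)


if __name__ == "__main__":
    # Constant bands (7.5, 5, 2 and 0.9 kb) appear in every lane and are ignored.
    print(call_alleles([7.5, 6.5, 5.0, 3.5, 2.0, 0.9]))
```

A lane showing only the 6.5 kb polymorphic band is called as allele 1, a lane showing only the 3.5 kb band as allele 2, and a lane showing both as heterozygous.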

An example of using the androgen receptor clones to detect
mutations is shown in Figure 9, where DNA samples from five different
patients with complete androgen insensitivity are digested with EcoRI,
electrophoresed, transferred to a Southern blot, and probed with cDNA probe B.
The patient in lane B lacks a 3 kb band, indicating that part of the
androgen receptor gene is deleted. Further analysis of this and
other patients' DNA is possible with other AR probes and by
sequencing by standard methods and comparing the abnormal sequence 
to the normal sequence described herein. 
Other potential uses for oligonucleotide sequences disclosed,
for example in construction of therapeutics to block genetic 
expression, will be obvious to one of ordinary skill in the art. 

What is claimed is:

1. A recombinant DNA molecule comprising a DNA sequence 
encoding the structural gene for androgen receptor protein.

2. The recombinant DNA molecule of Claim 1 wherein the 
androgen receptor protein is a human androgen receptor protein. 

3. A cloning vehicle comprising a genomic DNA molecule which 
upon expression in a eukaryotic host produces androgen receptor 
protein. 

4. The cloning vehicle of Claim 3 wherein the androgen 
receptor protein is a human androgen receptor protein. 

5. An androgen receptor protein produced by translation of 
the DNA sequence encoding androgen receptor protein in a host
organism transfected or transformed by the cloning vehicle of Claim 
3. 

6. A human androgen receptor protein produced by translation 
of the DNA sequence encoding human androgen receptor protein in a 
host organism transfected or transformed by the cloning vehicle of 
Claim 4. 
ABSTRACT OF THE DISCLOSURE

DNA sequences encoding human androgen receptor protein and 
polypeptides and proteins having substantially the same biological 
activity as human androgen receptor protein and the amino acid 
sequences of human androgen receptor protein and polypeptides and 
proteins having substantially the same biological activity as human 
androgen receptor protein are disclosed. Methods for the production 
and use of such compositions are also disclosed. 
BABCTCTGGA CAAAATTGAG CBCCTATGTG TACATGGCAA GTGTTTTTAG TGTTTGTGTG
CTCGA6ACCT GTTTTAACTC GCGGATACAC ATBTACCBTT CACAAAAATC ACAAACACAC 

70 80 90 100 110 120

TTTACCTGCT TGTCTGGGTG ATTTTGCCTT TBAGAGTCTG GATGAGAAAT GCATGGTTAA 
AAATGGACBA ACASACCCAC TAAAACBGAA ACTCTCAGAC CTACTCTTTA CGTACCAATT 

130 140 150 160 170 180

AGBCAATTCC AGACAGGAAB AAAGGCABAB AAGAGGGTAG AAATGACCTC TBATTCTTGG 
TCCGTTAAGG TCTGTCCTTC TTTCCGTCTC TTCTCCCATC TTTACTGBAG ACTAAGAACC 

190 200 210 220 230 240

GGCTGABGGT TCCTAGABCA AATGGCACAA TGCCACGABG CCCGATCTAT CCCTATGACG 
CCGACTCCCA AGGATCTCGT TTACCGTGTT ACGGTGCTCC GGGCTAGATA GGGATACTGC 

250 260 270 280 290 300

GAACTCTAAG GTTTCAGCAT CAGCTATCTG CTGGCTTGGT CACTGGCTTB CCTCCTCAGT 
CTTGAGATTC CAAAGTCBTA GTCBATABAC GACCGAACCA GTGACCGAAC GGAGGAGTCA 

310 320 330 340 350 360

TTGTAGGAGA CTCTCCCACT CTCCCATCTG CGCGCTCTTA TCAGTCCTGA AAAGAACCCN 
AACATCCTCT GAGAGGGTGA GAGBGTAGAC GCGCGAGAAT AGTCAGGACT TTTCTTGGGN 

370 380 390 400 410 420 

TGGCNAGCCA 6GAGCNAGGT ATTCNTATCG TCCTTTTCNT CCTCCTNGCC TCACCTNGTT 
ACCBNTCBGT CCTCGNTCCA TAAGNATAGC AGGAAAABNA GGAGGANCGG AGTGGANCAA 

430 440 450 460 470 480

BNTTTTTABA TTGGNCTTNG NAACCAAATT TGTATGCTGG CCTCCAGGAA ATCTGGABCC 
^CNAAAAATCT AACCNGAANC NTTGGTTTAA ACATAC6ACC GGAGGTCCTT TAGACCTCGG 

490 500 510 520 530 540

TGBCOCCTAA ACCTTGGTTT AGGAAAGCAG GAGCTATTCA GGAAGCAGGG TCCTCCAGGG 
ACCGCGlSATT TBGAACCAAA TCCTTTCGTC CTCGATAAET CCTTCGTCCC AGBAGBTCCC 

550 560 570 580 590 600

CTAGABCTAG CCTCTCCTGC CCTCGCCCAC BTGCGCCAGC ACTTBTTTCT CCAAABCNAC 
GATCTCGATC GGAGABBACG GGAGCGGGTG CACGCGGTCG TGAACAAAGA BSTTTCGNTG 

610 620 630 640 650 660 

TAGGCABGCG TTAGCBCGCG GTGAGGGGAG BBGAGAAAAG GAAAGGGGAB GGGAGGGAAA 
ATCCGTCCGC AATCGCGCGC CACTCCCCTC CCCTCTTTTC CTTTCCCCTC CCCTCCCTTT 

670 680 690 700 710 720

AGGAGGTGGB AAGGCAAGGA GGCCGGCCNB 6TGGGGGCGG GACCCGACTC GCANNAACTG 
TCCTCCACCC TTCC6TTCCT CCBGCCBGNC CACCCCCGCC CTGGGCTBAG CGTNNTTGAC 

730 740 750 760 770 780

TTGCATTTGC TCTCCACCTC CCf^GCGCCCC CTCCBASATC CCGGGGAGCC AGCTTGCTGG 
AACGTAAACG AGAGBTGGAG GGTCGCGGBG GAGGCTCTAG GBCCCCTCBS TCSAACGACC 

790 800 810 820 830 840

GAGAGCGBGA ACGGTCCGGA GCAABCCCAG AGGCAGAGGA GGCGACAGAB GGAAAAAGBG 
CTCTCGCCCT TGCCAGGCCT CGTTCGG6TC TCCGTCTCCT CCGCTGTCTC CCTTTTTCCC 

850 860 870 880 890 900

CCCNAGCTAG CCGCTCCAGT GCTGTACAGN AGCCGAAGGA CGCACCACGC CAGCCCCAGC 
GBGNTCGATC GGCBABGTCA CGACATGTCN TCBGCTTCCT GCGTGGTBCB BTCGGG6TCG 


910 920 930 940 950 960 

CCGGCTCCAG CGACAGCNAA CGCCTCTTGC ANGCGTTCGA AGCCGCCGCC CGGAGCTGCC 
GGCCGAGGTC GCTGTCGNTT GCGGAGAACG TNCGCAAGCT TCGGCGBCGG GCCTCGACGG 

970 980 990 1000 1010 1020

CTTTCCTCTT CGGTGAAGTT TTTAAAAGCT GCTAAAGACT CGGAGGAAGC AA66AAAGTG 
GAAAGGAGAA GCCACTTCAA AAATTTTCGA CGATTTCT6A GCCTCCTTCG TTCCTTTCAC 

1030 1040 1050 1060 1070 1080 

CCTGGTAGGA CTGACGGCTG CCTTTGTCCT CCTCCTCTCC ACCCCGCCTC CCCCCACCCT 
BGACCATCCT GACTGCCGAC GGAAACAGGA GGAGGAGAGG TGG66CB6AG GGGBBTGGGA 

1090 1100 1110 1120 1130 1140 

GCCTTCCCCC CCTCCCCCGT CTTCTCTCCC BCAGCTBCCT CAGTCGGCTA CTCTCAGCCA 
CGGAAGBGBG GGAGGGGGCA GAAGAGAGGG CGTCGACGGA GTCAGCCGAT GAGAGTCGGT 

1150 1160 1170 1180 1190 1200

ACCCCCCTCA CCACCCTTCT CCCCACCCGC CCCCCCGCCC CCGTCeGCCC AGCGNTGNCA 
TGBGGGGAGT GGTGGGAAGA 6GGGTGGGCG GGGGGGCGGG GGCAGCCGGG TCGCNACNGT 

1210 1220 1230 1240 1250 1260

GNCCGAGTTT GCAGAGAGGT AACTCCCTTT GGCTGCGAGC GGGCGAGNCT AGCTGCACAT 
CNGGCTCAAA CGTCTCTCCA TTBABGGAAA CCGACGCTCG CCCGCTCNGA TCGACGTGTA 

1270 1280 1290 1300 1310 1320 

TGCAAAGAAG GCTCTTAGGA GCAGGCGACT GGGGAGCGGC TTCAGCACTG CAGCCACGAC 
ACGTTTCTTC CGAGAATCCT CGTCCGCTGA CCCCTCGCCG AAGTCGTGAC GTCGGTGCTG 

1330 1340 1350 1360 1370 1380 

CNGCCTGGTT ABGCTGCACG C6GAGAGAAC CCTCTGTTTT CCCCCACTCT CTCTCCACCT 
BNCGGACCAA TCCBACGTGC GCCTCTCTTG BBAGACAAAA GGGGGTGAGA GAGAGGTGGA 

1390 1400 1410 1420 1430 1440

CCTCCTGCCT TCCCCACCCC GABTGCGGAG CCAGAGATCA AAAGATGAAA AGGCAGTCAG 
GGAGGACGGA AGGGGTGGGG CTCACGCCTC GGTCTCTAGT TTTCTACTTT TCCGTCAGTC 

1450 1460 1470 1480 1490 1500 

GTCTTCAGTA GCCAAAAAAC AAAACAAACA AAAACAAAAA AGCCGAAATA AAAGAAAAAG 
CAGAAGTCAT CGGTTTTTTB TTTTGTTTGT TTTTGTTTTT TCGGCTTTAT TTTCTTTTTC 

1510 1520 1530 1540 1550 1560 

ATAATAACTC AGTTCTTATT TBCACCTACT TCAGTGGACA CTGAATTTGG AAGGTGGAGG 
TATTATIGAG TCAAGAATAA ACGTGGATGA AGTCACCTGT GACTTAAACC TTCCACCTCC 

1570 1580 1590 1600 1610 1620

ATTTTGTTTT TTTCTTTTAA GATCTGGGCA TCTTTTGAAT CTACCCTTCA AGTATTAAGA 
TAAAACAAAA AAAGAAAATT CTAGACCCGT ABAAAACTTA GATGBGAAGT TCATAATTCT 

1630 1640 1650 1660 1670 1680

GACAGACTGT GAGCCTAGCA GGGCAGATCT TGTCCACCGT GTGTCTTCTT CTGCACGAGA 
CTGTCTGACA CTCG6ATCGT CCCGTCTAGA ACAGGTGGCA CACAGAAGAA GACGTGCTCT 

1690 1700 1710 1720 1730 1740 

CTTTGABBCT GTCAGAGC6C TTTTTGCGTG GTTGCTCCCG CAAGTTTCCT TCTCTGGAGC 
GAAACTCCGA CAGTCTCGCG AAAAACGCAC CAACGAGGGC GTTCAAA6GA AGAGACCTCG 

1750 1760 1770 1780 1790 1800 

TTCCC6CAGG TGGGCAGCTA GCTGCAGCGA CTACCGCATC ATCACAGCCT GTTGAACTCT 
AA6QGCGTCC ACCCGTCGAT CBACGTCGCT GATGGCGTAG TAGTBTCBGA CAACTTBAGA 


1810 1820 1830 1840 1850 1860

TCTGAGCAAG ABAAGGGGAG GCGGG6TAAG GGAAGTAGGT GGAAGATTCA GCCAAGCTCA 
AGACTCGTTC TCTTCCCCTC C6CCCCATTC CCTTCATCCA CCTTCTAAGT C6GTTCBAGT 

1870 1880 1890 1900 1910 1920 

AGGATGGAAG TGCABTTAGG 6CTGGGAAGG GTCTACCCTC GGCCGCCGTC CAA6ACCTAC 
TCCTACCTTC ACGTCAATCC CGACCCTTCC CAGATGGGAG CCBGCGGCAG GTTCTGGATG 

1930 1940 1950 1960 1970 1980

C6AGGAGCTT TCCAGAATCT GTTCCAGAGC GTGCGCGAAG TGATCCAGAA CCCGGGCCCC 
GCTCCTC6AA AGGTCTTAGA CAAGGTCTCG CACGCGCTTC ACTAGGTCTT GGGCCCGGGG 

1990 2000 2010 2020 2030 2040

AGGCACCCAG AGGCCGCGAG CBCaCCACCT CCCGGCGCCA GTTTGCTGCT GCT6CA6CAG 
TCCGTGGGTC TCCGGCGCTC GCGTC^TGGA GGGCCGCGGT CAAACGACGA CGACGTCGTC 

2050 2060 2070 2080 2090 2100

CAGCAGCAGC AGCAGCAGCA GCAGCAGCAG CAGCAGCAGC AGCAGCABCA GCABCAGCAG 
GTCGTCGTCB TCGTCGTCGT CGTCGTCGTC GTCGTCBTCB TCGTCGTCGT CGTCGTCGTC 

2110 2120 2130 2140 2150 2160

CAGCAGCAAG AGACTAGCCC CAGGCAGCAG CAGCAGCAGC AGGGTBAGGA TGGTTCTCCC 
GTCGTCfi'TTC TCTGATCGGG GTCC6TCGTC GTCGTCGTCG TCCCACTCCT ACCAAGAGGS 

2170 2180 2190 2200 2210 2220

CAAGC;:CATC GTAGAGGCCC CACAGGCTAC CTGGTCCTGG ATGAGGAACA GCAACCTTCA 
GTTCGGGTAG CATCTCCG5G GTGTCCGATB GACCAGGACC TACTCCTT6T CGTTGGAAG t 

2230 2240 2250 2260 2270 2280

CAGCCBCAGT CG6CCCTGGA GTBCCACCCC GAGAGAGGTT GCQTCCCAGA GCCTBGAGCC 
"^'gTCGGCGTCA GCCGGGACCT CACGBTGGGG CTCTCTCCAA CGCAGGGTCT CGGACCTCSb' 

2290 2300 2310 2320 2330 2340

GCCSTGSCCG CCAGCAAGGG GCTGCCGCAG CAGCTGCCAB CACCTCCGGA CGAGGATGAC 
CLVr-CACC.»3L-C GGTCBTTCCC CGACGGCGTC GTCGACGGTC BTBGA5GCCT BCTCCTACTG 

2350 2360 2370 2380 2390 2400

7CABCT5rCC CATCCACGTT GTCCCTGCTB GBCCCCACTT TCCCCGGCTT AAGCAGCTGC 
AGTCGACGBG GTAGGTGCAA CAGGGACGAC CCGGGBTGAA A6GGGCCGAA TTCGTCBACG 

2410 2420 2430 2440 2450 2460

TCCBCTGACC TTAAAGACAT CCTGAGCGAB GCCAGCACCA TGCAACTCCT TCAGCAACAG 
ABGCGALTG3 AATTTCTGTA GBACTCGCTC CBBTCGTGBT ACGTT6AGGA AGTCGTTGTC 

2470 2480 2490 2500 2510 2520

CAGC-iGGAAG CAGTATCCGA ABGCAGCABC ABCGBGAGAG CGAGGGAGGC CTCGGGGGCT 
GTCGTCCTTC GTCATAGGCT TCCGTCGTCG TCBCCCTCTC GCTCCCTCCG GAGCCCCCGA 

2530 2540 2550 2560 2570 2580

CCCACTTCCT CCAABGACAA TTACTTAGG6 GGCACTTCGA CCATTTCTGA CAACGCCAAG 
GGGTGAAGGA GGTTCCTGTT AATBAATCCC CCGTGAABCT GGTAAABACT GTTBCGGTTC 

2590 2600 2610 2620 2630 2640

GAGTTGTBTA ABGCAGT6TC BGT6TCCATG GGCCTGGGTG TGGAGGCBTT GGAGCATCTG 
CTCAACACAT TCCGTCACAG CCACAGGTAC CCGGACCCAC ACCTCCGCAA CCTCGTAGAC 

2650 2660 2670 2680 2690 2700

AGTCCAGGGG AACAGCTTCG GGGGGATTGC ATGTAC6CCC CACTTTTGGG AGTTCCACCC 
TCAG'3TCCCC TTGTCGAAGC CCCCCTAACG TACATGCGGG BTGAAAACCC TCAAG6TGBG 


2710 2720 2730 2740 2750 2760

GCT6TB9CTC CCACTCCTTB TGCCCCATTG GCCGAATGCA AAGGTTCTCT GCTAGACGAC " T-V'-'^L' ci" r^^" 
CEACACCGAG 6GTGAGGAAC ACGG66TAAC CGGCTTACGT TTCCAAGAGA CGATCTC3CTG ^Ctir^Ji- - 

2770 2780 2790 2800 2810 2820

AGCf?CAGGCA AGAGCACTGA AGATACTGCT GAGTATTCCC CTTTCAAGGG AGGTTACACC ^ ;; r;;^ ' i^gst;^ 
TCGCGTCCGT TCTCBTGACT 7CTATGACGA CTCATAABGG GAAAGTTCCC TCCAATBTBQ -^^^ijr' 

2830 2840 2850 2860 2870 2880

AAAGGGCTAG AAGGCGAGAG CCTA6BCTGC TCTGGCAGCG CTGCAGCABB GAGCTCCQBa^V'.^^^ti*-^^?:; 
mCCCBATC TTCCGCTCTC BGATCCGAC6 AGACCGTCGC GACGTCGTCC CTCQAGGC(^;^. ^^^^^ 

2890 2900 2910 2920 2930 2940

ArArTTG^*^>C TGCCGTCTAC CCTGTCTCTC TACAAGTCCG GABCACTGGA CGAGBCAGCTijiji,: > 
TGIt^AACTTG ACGGCAGAIG GGACAGAGAG ATGTTCAGGC CTCBTBACCT GCTCCBTCSA"^"*"' " ' 

2950 2960 2970 2980 2990 3000

nrCTACTAGA GTCGCGACTA CTACAACTTT CCACTGGCTC TGGCCGGACC GCCBCCCCCTv^.-. --f^ruii -.:^'Vvh 
CGL'^.TGGTCT CAGCGCTGAT GATGTTGAAA GGTGACCGAG ACCGGCCTGG CGGCGGBBGA : .i-'-K' 

3010 3020 3030 3040 3050 3060

CCGCCGCCTC CCCATCCCCA CGCTCGCATC AAGCTGGAGA ACCCGCTGGA CTACGGCABC -• VXT:^ 
GGCGGCBGAG GGGTAGGGBT GCGAGCGTAG TTCGACCTCT TGGGCGACCT GATBCCGTCa ^ ■ •XQ.tH 



3070 3080 3090 3100 3110 3120

GCnr-^L-GrGG CTt^Xpf^CGGC t^A^TGCCGC *T^TGGGGACC TGGCGABCCT GCATGGCGCG * \: '''-.^^ 
CGGv!:iC!:".rf3CC Gr,fjgpCGCCG G>SfU.ACGGQfr A?"ACCCCT GG ACCGCTCGGA CGTACC6CGC ' '-^y- 

3130 3140 3150 3160 3170 3180

rr-li-^C.-x^CGG GACCCGGTTC TGGBTCACCC TCAGCCGCCG CTTCCTCATC CTBGCACACT , -.^ - . ^^-^r 

^-:cAf r.TrGcc ctgggccaag acccagtggg agtcggcggc gaaggagjag baccgtbtga** ' ' ^-tf* 

3190 3200 3210 3220 3230 3240

CTCTTiT.nCAG CCGAAGAAGG CCAGTTGTAT GGACCGTGTG GTGGTGGTGG GBGTBGTGBC 

GAC-AAGTGTC GGCTTCTTCC GGTCAACATA CCTGGCACAC CACCACCACC CCCACCACCB ' '*'r.. ^'' '^ 

3250 3260 3270 3280 3290 3300

GGCGf^C'^GCB GCGGCGGCGG CGGCGGCGGC 6GCBGCGGCB GCGGCGGCGG CGAGGCGGGA 
Cl:Grc.G:""CGC CGCCGCCGCC GCCGCCGCCG CCGCCGCCGC CGCCGCCBCC GCTCCBCCCT *. - 

3310 3320 3330 3340 3350 3360

Gr-TRTftGCCC CriACGGCTA CACTCGGCCC CCTCAGGGGC TGGCGGGCCA GGAAAGCGAC. Vl'!" 
CGArAlCGGG GGATGCCGAT GTGAGCCGGG GGABTCCCCG ACC6CCCGGT CCTTTCBCTG ' - . V '"'5.;;' 

3370 3380 3390 3400 3410 3420

T7CArcGCAC CTGA7GTGTG GTACCCTGGC GGCATGGTGA GCAGAGTGCC a:TATCCCAGT :. = ^' 
AAG"TGr-CGlG GACTACAI.AC CATGGGACCG CCGTACCACT CGTCTCACBG GATABGGTCA " - • ' 

3430 3440 3450 3460 3470 3480

Cf CA'-TTGTG TCA^AAGCGA AATGGGCCCC TGGATGGATA GCTACTCCCG GGAACCTTAC ' • -'" 

GBGTL-AACAC AGTTTTCGCT TTACCCGGGG ACCTACCTAT CBATGAGGGC CCTTBGAATG • „ -* * 

3490 3500 3510 3520 3530 3540

GGGGArATBC GTTTGGAGAC TGCCAGGGAC CATGTTTTGC CCATTGACTA TTACTTTCCA ::' , J\ 

CCCr:7C-.7 ACG CAAACCTCTG .ACGGTCCCTG GTACAAAAC6 GGTAACTGAT AATBAAAGGT 

3550 3560 3570 3580 3590 3600

CCr'_AF.^'i'"C-"* CCTGCCTGAI CTGTGGAGAT GAAGCITCTG GGTGTCACTA TGGAGCTCTD 
GGGG"^CTTCT GBACGGACTA GACALCTCTA CTTCGAAGAC CCACAGTGAT ACCTCGAGAG 


3610 3620 3630 3640 3650 3660

• ACATGTBGAA GCTBCAAGGT CTTCTTCAAA AGABCCSCTG AABSBAAACA GAABTACCTS 
TGTACACCTT CGACGTTCCA GAAGAAGTTT TCTCGGCGAC TTCCCTTTGT CTTCATGGAC 

3670 3680 3690 3700 3710 3720

TGCGCCAGCA GAAATGATTG CACTATTGAT AAATTCCGAA GGAAAAATTG TCCATCTTGT
ACGCGGTCGT CTTTACTAAC GTGATAACTA TTTAAGGCTT CCTTTTTAAC AGGTAGAACA

3730 3740 3750 3760 3770 3780

CGTCTTCGGA AATGTTATGA AGCAGGGATG ACTCTGGGAG CCCGGAAGCT GAAGAAACTT
GCAGAAGCCT TTACAATACT TCGTCCCTAC TGAGACCCTC GGGCCTTCGA CTTCTTTGAA

3790 3800 3810 3820 3830 3840

GGTAATCTGA AACTACAGGA GGAAGGAGAG GCTTCCAGCA CCACCAGCCC CACTGAGGAG
CCATTAGACT TTGATGTCCT CCTTCCTCTC CGAAGGTCGT GGTGGTCGGG GTGACTCCTC

3850 3860 3870 3880 3890 3900 

ACAACCCAGA AGCTGACAGT GTCACACATT GAAGGCTATG AATGTCAGCC CATCTTTCTG
TGTTGGGTCT TCGACTGTCA CAGTGTGTAA CTTCCGATAC TTACAGTCGG GTAGAAAGAC

3910 3920 3930 3940 3950 3960

AATGTCCTGG AAGCCATTGA GCCAGGTGTA GTGTGTGCTG GACACGACAA CAACCAGCCC
TTACAGGACC TTCGGTAACT CGGTCCACAT CACACACGAC CTGTGCTGTT GTTGGTCGGG

3970 3980 3990 4000 4010 4020

GACTCCTTTG CAGCCTTGCT CTCTAGCCTC AATGAACTGG GAGAGAGACA GCTTGTACAC
CTGAGGAAAC GTCGGAACGA GAGATCGGAG TTACTTGACC CTCTCTCTGT CGAACATGTG

4030 4040 4050 4060 4070 4080

GTGGTCAAGT GGGCCAAGGG CTTGCCTGGC TTCCGCAACT TACACGTGGA CGACCAGATG
CACCAGTTCA CCCGGTTCCC GAACGGACCG AAGGCGTTGA ATGTGCACCT GCTGGTCTAC

4090 4100 4110 4120 4130 4140

GCTGTCATTC AGTACTCCTG GATGGGGCTC ATGGTGTTTG CCATGGGCTG GCGATCCTTC
CGACAGTAAG TCATGAGGAC CTACCCCGAG TACCACAAAC GGTACCCGAC CGCTAGGAAG

4150 4160 4170 4180 4190 4200

ACCAATGTCA ACTCCAGGAT GCTCTACTTC GCCCCTGATC TGGTTTTCAA TGAGTACCGC
TGGTTACAGT TGAGGTCCTA CGAGATGAAG CGGGGACTAG ACCAAAAGTT ACTCATGGCG

4210 4220 4230 4240 4250 4260

ATGCACAAGT CCCGGATGTA CAGCCAGTGT GTCCGAATGA GGCACCTCTC TCAAGAGTTT
TACGTGTTCA GGGCCTACAT GTCGGTCACA CAGGCTTACT CCGTGGAGAG AGTTCTCAAA

4270 4280 4290 4300 4310 4320

GGATGGCTCC AAATCACCCC CCAGGAATTC CTGTGCATGA AAGCACTGCT ACTCTTCAGC
CCTACCGAGG TTTAGTGGGG GGTCCTTAAG GACACGTACT TTCGTGACGA TGAGAAGTCG

4330 4340 4350 4360 4370 4380 

ATTATTCCAG TGGATGGGCT GAAAAATCAA AAATTCTTTG ATGAACTTCG AATGAACTAC 
TAATAAGGTC ACCTACCCGA CTTTTTAGTT TTTAAGAAAC TACTTGAAGC TTACTTGATG

4390 4400 4410 4420 4430 4440

ATCAAGGAAC TCGATCGTAT CATTGCATGC AAAAGAAAAA ATCCCACATC CTGCTCAAGA
TAGTTCCTTG AGCTAGCATA GTAACGTACG TTTTCTTTTT TAGGGTGTAG GACGAGTTCT

4450 4460 4470 4480 4490 4500 

CGCTTCTACC AGCTCACCAA GCTCCTGGAC TCCGTGCAGC CTATTGCGAG AGAGCTGCAT
GCGAAGATGG TCGAGTGGTT CGAGGACCTG AGGCACGTCG GATAACGCTC TCTCGACGTA


4510 4520 4530 4540 4550 4560

CAGTTCACTT TTGACCTGCT AATCAAGTCA CACATGGTGA GCGTGGACTT TCCGGAAATG
GTCAAGTGAA AACTGGACGA TTAGTTCAGT GTGTACCACT CGCACCTGAA AGGCCTTTAC

4570 4580 4590 4600 4610 4620

ATGGCAGAGA TCATCTCTGT GCAAGTGCCC AAGATCCTTT CTGGGAAAGT CAAGCCCATC
TACCGTCTCT AGTAGAGACA CGTTCACGGG TTCTAGGAAA GACCCTTTCA GTTCGGGTAG

4630 4640 4650 4660 4670 4680

TATTTCCACA CCCAGTGAAG CATTGGAAAC CCTATTTCCC CACCCCAGCT CATGCCCCCT
ATAAAGGTGT GGGTCACTTC GTAACCTTTG GGATAAAGGG GTGGGGTCGA GTACGGGGGA

4690 4700 4710 4720 4730 4740 

TTCAGATGTC TTCTGCCTGT TATAACTCTG CACTACTCCT CTGCAGTGCC TTGGGGAATT
AAGTCTACAG AAGACGGACA ATATTGAGAC GTGATGAGGA GACGTCACGG AACCCCTTAA

4750 4760 4770 4780 4790 4800

TCCTCTATTG ATGTACAGTC TGTCATGAAC ATGTTCCTGA ATTCTATTTG CTGGGCTTTT
AGGAGATAAC TACATGTCAG ACAGTACTTG TACAAGGACT TAAGATAAAC GACCCGAAAA

4810 4820 4830 4840 4850 4860

TTTTTCTCTT TCTCTCCTTT CTTTTTCTTC TTCCCTCCCT ATCTAACCCT CCCATGGCAC
AAAAAGAGAA AGAGAGGAAA GAAAAAGAAG AAGGGAGGGA TAGATTGGGA GGGTACCGTG

4870 4880 4890 4900 4910 4920

CTTCAGACTT TGCTTCCCAT TGTGGCTCCT ATCTGTGTTT TGAATGGTGT TGTATGCCTT
GAAGTCTGAA ACGAAGGGTA ACACCGAGGA TAGACACAAA ACTTACCACA ACATACGGAA

4930 4940 4950 4960 4970 4980 

TAAATCTGTG ATGATCCTCA TATGGCCCAG TGTCAAGTTG TGCTTGTTTA CAGCACTACT
ATTTAGACAC TACTAGGAGT ATACCGGGTC ACAGTTCAAC ACGAACAAAT GTCGTGATGA

4990 5000 5010 5020 5030 5040

CTGTGCCAGC CACACAAACG TTTACTTATC TTATGCCACG GGAAGTTTAG AGAGCTAAGA
GACACGGTCG GTGTGTTTGC AAATGAATAG AATACGGTGC CCTTCAAATC TCTCGATTCT

5050 5060 5070 5080 

TTATCTGGGG AAATCAAAAC AAAAAACAAG CAAACAAAAA AAAAA 
AATAGACCCC TTTAGTTTTG TTTTTTGTTC GTTTGTTTTT TTTTT 
10 20 30 40 50 60

GAGCTCTGGA CAAAATTGAG CGCCTATGTG TACATGGCAA GTGTTTTTAG TGTTTGTGTG

70 80 90 100 110 120

TTTACCTGCT TGTCTGGGTG ATTTTGCCTT TGAGAGTCTG GATGAGAAAT GCATGGTTAA

130 140 150 160 170 180

AGGCAATTCC AGACAGGAAG AAAGGCAGAG AAGAGGGTAG AAATGACCTC TGATTCTTGG

190 200 210 220 230 240

GGCTGAGGGT TCCTAGAGCA AATGGCACAA TGCCACGAGG CCCGATCTAT CCCTATGACG 

250 260 270 280 290 300

GAACTCTAAG GTTTCAGCAT CAGCTATCTG CTGGCTTGGT CACTGGCTTG CCTCCTCAGT 

310 320 330 340 350 360 

TTGTAGGAGA CTCTCCCACT CTCCCATCTG CGCGCTCTTA TCAGTCCTGA AAAGAACCCN

370 380 390 400 410 420 

TGGCNAGCCA GGAGCNAGGT ATTCNTATCG TCCTTTTCNT CCTCCTNGCC TCACCTNGTT

430 440 450 460 470 480

GNTTTTTAGA TTGGNCTTNG NAACCAAATT TGTATGCTGG CCTCCAGGAA ATCTGGAGCC

490 500 510 520 530 540

TGGCGCCTAA ACCTTGGTTT AGGAAAGCAG GAGCTATTCA GGAAGCAGGG TCCTCCAGGG 

550 560 570 580 590 600

CTAGAGCTAG CCTCTCCTGC CCTCGCCCAC GTGCGCCAGC ACTTGTTTCT CCAAAGCNAC

610 620 630 640 650 660 

TAGGCAGGCG TTAGCGCGCG GTGAGGGGAG GGGAGAAAAG GAAAGGGGAG GG^AGGGAAA

670 680 690 700 710 720 

AGGAGGTGGG AAGGCAAGGA GGCCGGCCNG GTGGGGGCGG GACCCGACTC GCANNAACTG

730 740 750 760 770 780 

TTGCATTTGC TCTCCACCTC CCAGCGCCCC CTCCGAGATC CCGGGGAGCC AGCTTGCTGG 

790 800 810 820 830 840 

GAGAGCGGGA ACGGTCCGGA GCAAGCCCAG AGGCAGAGGA GGCGACAGAG GGAAAAAGGG

850 860 870 880 890 900

CCCNAGCTAG CCGCTCCAGT GCTGTACAGN AGCCGAAGGA CGCACCACGC CAGCCCCAGC 

910 920 930 940 950 960

CCGGCTCCAG CGACAGCNAA CGCCTCTTGC ANGCGTTCGA AGCCGCCGCC CGGAGCTGCC 

970 980 990 1000 1010 1020 

CTTTCCTCTT CGGTGAAGTT TTTAAAAGCT GCTAAAGACT CGGAGGAAGC AAGGAAAGTG

1030 1040 1050 1060 1070 1080

CCTGGTAGGA CTGACGGCTG CCTTTGTCCT CCTCCTCTCC ACCCCGCCTC CCCCCACCCT 

1090 1100 1110 1120 1130 1140

GCCTTCCCCC CCTCCCCCGT CTTCTCTCCC GCAGCTGCCT CAGTCGGCTA CTCTCAGCCA

1150 1160 1170 1180 1190 1200 

ACCCCCCTCA CCACCCTTCT CCCCACCCGC CCCCCCGCCC CCGTCGGCCC AGCGNTGNCA

1210 1220 1230 1240 1250 1260

GNCCGAGTTT GCAGAGAGGT AAC GGGCGAGNCT AGCTGCACAT



1270 1280 1290 1300

TtCnAAGAAG GCTCTTAGGA GCAGGCGACT GGGGAGCGGC

1330 1340 1350 1360

CNGCCTGGTT AGGCTGCACG CGGAGAGAAC CCTCTGTTTT

1390 1400 1410 1420

CCTCCTGCCT TCCCCACCCC GAGTGCGGAG CCAGAGATCA 

1450 1460 1470 1480

GTCTTCAGTA GCCAAAAAAC AAAACAAACA AAAACAAAAA

1510 1520 1530 1540

ATAATAACTC AGTTCTTATT TGCACCTACT TCAGTGGACA

1570 1580 1590 1600

ATTTTGTTTT TTTCTTTTAA GATCTGGGCA TCTTTTGAAT 

1630 1640 1650 1660

GACAGACTGT GAGCCTAGCA GGGCAGATCT TGTCCACCGT

1690 1700 1710 1720

CTTTGAGGCT GTCAGAGCGC TTTTTGCGTG GTTGCTCCCG

1750 1760 1770 1780

TTC-CCGCAGG TGGGCAGCTA GCTGCAGCGA CTACCGCATC 

1810 1820 1830 1840

TCTr.AGCAAG AGAA5GGGAG BCGGGGTAAG G6AABTABG7 

1890

t^r-u AT'-r. C-AA GTG CAG TTA GGG CTG EGA AGG GTC 
'Et Glu Gin Lecf Gly Leu Gly Arc; Val 

1950

CP.^ irL.A GUT TTC CAG AAT CTG TTC CAB AGC GTG 
Hrg i^ly fMEr. PhET- Gin ABn Leu Phe Gin S&r Vai 

2010

AG'_. CAC -CA GAG GCC GCG AGC GCACo> CCT CCC 
Aro HiE Pro G3 u Ala Ala Ser A3 a ^|<at Pro Pro 

2070

CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG
Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln

2130 

CAG CAG CAA GAG ACT AGC CCC AGG CAB CAG CAG 
Gin GJn Gin Gin Thr 5er Pro Arg Gin Gin Gin 

2190

CAA GCC CAT CGT AGA GGC CCC ACA GGC TAC CTG
Gln Ala His Arg Arg Gly Pro Thr Gly Tyr Leu

2250

CAG CCG CAG TCG GCC CTG GAG TGC CAC CCC GAG
Gln Pro Gln Ser Ala Leu Glu Cys His Pro Glu

2310

GCC GTG GCC GCC AGC AAG GGG CTG CCG CAG CAG
Ala Val Ala Ala Ser Lys Gly Leu Pro Gln Gln


1310 1320

TTCAGCACTG CAGCCACGAC

1370 1380 
CCCCCACTCT CTCTCCACCT 

1430 1440
AAAGATGAAA AGGCAGTCAG

1490 1500
AGCCGAAATA AAAGAAAAAG 

1550 1560 
CTGAATTTGG AAGGTGGAGG

1610 1620
CTACCCTTCA AGTATTAAGA 

1670 1680
GTGTCTTCTT CTGCACGAGA

1730 1740
CAAGTTTCCT TCTCTGGAGC 

1790 1800
ATCACAGCCT GTTGAACTCT

1850 1860

GGAAGATTCA GCCAAGCTCA


TAC CCT CGG CCG CCG TCC AAG ACC TAC
Tyr Pro Arg Pro Pro Ser Lys Thr Tyr

1980

CGC GAA GT6 ATC CAG AAC CCG GG" CCC 
Arg Giu Va 1 lie Gin Asn Pro Giy Fro 

GGC GCC AGT TTG CTG CTG CTG CAG CAG
Gly Ala Ser Leu Leu Leu Leu Gln Gln

CAG CAG CAG CAG CAG CAG CAG CAG CAG 
Gln Gln Gln Gln Gln Gln Gln Gln Gln

2160

CAG CAG CAG GGT GAG GAT GGT TCT CCC
Gln Gln Gln Gly Glu Asp Gly Ser Pro

GTC CTG GAT GAG GAA CAG CAA CCT TCA
Val Leu Asp Glu Glu Gln Gln Pro Ser

2280

AGA GGT TGC GTC CCA GAG CCT GGA GCC 
Arg Gly Cys Val Pro Glu Pro Gly Ala 

2340

CTG CCA GCA CCT CCG GAC GAG GAT GAC
Leu Pro Ala Pro Pro Asp Glu Asp Asp


M i\ 1 l-f.r rCA It'f: APG TTG T Cr CTG nni CCC act TTC CCC GGC TTA AGC ABC TSC 

^•-•1 f'^U* TiJa F» r- B^^v Thr Le-ii Strr Leti t pvi G3y Pro 7hr Phe Pre* Bly Leu Ber Ser Cys 

K r nr^ f^Ar cn ama gap: atc c7g aol-: pai? gcc agc acc atg caa ctc ctt cag caa cag 

be.) p]^ r^^:^:. Lpm LyB Arp lie Leu Ser Glu Alp Ser Thr Met Gin Leu Leu Bin Bin Bin 

\ i r-r, i^Ar', Gt A Gtp rrc GAA GOL* ABC AGC AGC GGG AGA GCG AGG BAG GCC TCG BBG BCT 
GJn G3r. Glu A3a Vr 1 Ser B]u G3>' B(?r Ger Ser Gly Arg Ala Arg Glu Ala Ber Bly Ala 

eriMi 25B0 
rr r ATT ICC ICC AAG GAC AAT lAC T7A GGG GGC ACT TCG ACC ATT TCT BAC AAC GCC AAG 
Pir. 71, r *^C'r Scr L A'^p (i^r> Tyr LeLi Gly Gly Thr Ser Thr lie Ser Asp Asn Ala Lys 

v;;..', ;](■; T HI /^pf-, fVA f^TP. KG GIG 7CC A?G GGC CTG GGT BTG GAG GCG TTG GAG CAT CTG 
fl < l*-!' ( y^^ L % A^.- Vf«1 Ser Vf^l Sei Mrt Gly Leu Gly Val Glu Ala Leu Blu His Leu 

AG I rCA Gf>B GAA CAG CTT CGG GGG GAT TGC ATG TAC GCC CCA CTT TTG 6GA GTT CCA CCC 
Sr-^ L. Gly' Glu LOn Leu Arg Gly A?P CyF. He* Tyr Ala Pro Leu Leu Bly Val Pre Pro 

P7?- 2760 
r.i. ! GIG Gr.T C.rr r-iC"! CCl TGT GCC CCA 17G GCC BAA TGC AAA GGT TCT CTG CTA BAC BAC 
Air i\} Pi pro Thr F ro Cye Ala Pro Leu Ala Glu Cys Lys Gly Ser Leu Leu Asp Asp 

£79^ 2B20 

r -r. g;:a gbc A^r; agc act gaa gat act gct gag tat tcc cct ttc aag gga ggt tac acc 

Ajr Bly Lvp SDr Thr B5u Anp Ihr Ala Glu Tyr Ser Pro Phe Ly& Bly Gly Tyr Thr 

28B0 

A^'.A GGL- C-TA GAn GGL' GAG ABC CTA GGC TGC TCT GGC AGC GCT GCA GCA GGG ABC TCC GGG 
i.yr^ Glv Leif GJu Gi^ GJu Ser Leu Gly Cysi Ber Gly Ser Ala Ala Ala Bly Ser Ser Gly 

MA C Ti H/'.o TTG CCG T C"l ACC CTG TCT CTC TAC AAG TCC GGA GCA CTB BAC BAG GCA GCT 
Tp.i 1 r:' f'Wu Lpm F-p rer Thr Le-..( Ser ». eu Tyr LvB Ser G}y Ala Leu Asp Glu Ala Ala 

pc?7o 3000 
l'.r{\ TAC C AG AGT CGC GAC TAC TAC AAC TTT CCA CTG BCT CTG GCC GGA CCG CCG CCC CCT 
A3.- T>'r Gin 5^-r Ara A=p Tyr Tyr Asn Phe Pro Leu Ala Leu Ala Gly Pro Pro Pro Pro 

3030 3060 
CLi: CTG CCT CCC CAT CLC CAC GCT CGC ATC AAG CTG GAG AAC CCG CTG GAC TAC BBC ABC 
r-rr* Pro Ft^. Hi? Pre Hie Ala Arg lie Lye Leu Glu Asn Pro Leu Asp Tyr Gly Ber 

30'^ f\ 3120 
•>Ll tgg nCG GCT GTS GCG GCG CA<i^-TGC cQ-<^UT GGG GAC CTG GCG AGC CTB CAT GGC GCG 
Air- T. p Al? Ala Ab;A3a Ala G4a C ve Arqi^r Gly Asp Leu Ala Ser Leu His Gly Ala 

ai^r.A * 3180 

nPT GCA GCG GGA fCC GGi TCT GGG TCA CCL TCA GCC GCC GCT TCC TCA TCC TGG CAC ACT 
i:,]',,.- ^^1^ fi] p Gly Pre Gly- Ser Gly S^ft Pro Ser Ala Ala Ala Ser Ser Ser Trp His Thr 

Lie TTC ACA GCC GAA GAA GGC CAG TTG TAT GGA CCG TGT GGT GGT GGT GGG GGT BGT GGC 
Lr-u V'-he Tt>r Ala Glu Glu Gly G3^» Leu T > r Glv Fro Cys Gly Gly Gly Bly Bly Gly Bl> 


3270 3300
GGC GGC GGC GGC GGC GGC GGC GGC GGC GGC GGC GGC GGC GGC GGC GGC GGC GAG GCG GGA
Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Gly Glu Ala Gly


3330 3360
GCT GTA GCC CCC TAC GGC TAC ACT CGG CCC CCT CAG GGG CTG GCG GGC CAG GAA AGC GAC
Ala Val Ala Pro Tyr Gly Tyr Thr Arg Pro Pro Gln Gly Leu Ala Gly Gln Glu Ser Asp

3390 3420
TTC ACC GCA CCT GAT GTG TGG TAC CCT GGC GGC ATG GTG AGC AGA GTG CCC TAT CCC AGT
Phe Thr Ala Pro Asp Val Trp Tyr Pro Gly Gly Met Val Ser Arg Val Pro Tyr Pro Ser 

3450 3480
CCC ACT TGT GTC AAA AGC GAA ATG GGC CCC TGG ATG GAT AGC TAC TCC CGG GAA CCT TAC
Pro Thr Cys Val Lys Ser Glu Met Gly Pro Trp Met Asp Ser Tyr Ser Arg Glu Pro Tyr

3510 3540
GGG GAC ATG CGT TTG GAG ACT GCC AGG GAC CAT GTT TTG CCC ATT GAC TAT TAC TTT CCA
Gly Asp Met Arg Leu Glu Thr Ala Arg Asp His Val Leu Pro Ile Asp Tyr Tyr Phe Pro

3570 3600 
CCC CAG AAG ACC TGC CTG ATC TGT GGA GAT GAA GCT TCT GGG TGT CAC TAT GGA GCT CTC
Pro Gln Lys Thr Cys Leu Ile Cys Gly Asp Glu Ala Ser Gly Cys His Tyr Gly Ala Leu

3630 3660
ACA TGT GGA AGC TGC AAG GTC TTC TTC AAA AGA GCC GCT GAA GGG AAA CAG AAG TAC CTG
Thr Cys Gly Ser Cys Lys Val Phe Phe Lys Arg Ala Ala Glu Gly Lys Gln Lys Tyr Leu

3690 3720 
TGC GCC AGC AGA AAT GAT TGC ACT ATT GAT AAA TTC CGA AGG AAA AAT TGT CCA TCT TGT
Cys Ala Ser Arg Asn Asp Cys Thr Ile Asp Lys Phe Arg Arg Lys Asn Cys Pro Ser Cys

3750 3780
CGT CTT CGG AAA TGT TAT GAA GCA GGG ATG ACT CTG GGA GCC CGG AAG CTG AAG AAA CTT
Arg Leu Arg Lys Cys Tyr Glu Ala Gly Met Thr Leu Gly Ala Arg Lys Leu Lys Lys Leu

3810 3840
GGT AAT CTG AAA CTA CAG GAG GAA GGA GAG GCT TCC AGC ACC ACC AGC CCC ACT GAG GAG
Gly Asn Leu Lys Leu Gln Glu Glu Gly Glu Ala Ser Ser Thr Thr Ser Pro Thr Glu Glu

3870 3900
ACA ACC CAG AAG CTG ACA GTG TCA CAC ATT GAA GGC TAT GAA TGT CAG CCC ATC TTT CTG
Thr Thr Gln Lys Leu Thr Val Ser His Ile Glu Gly Tyr Glu Cys Gln Pro Ile Phe Leu

3930 3960 
AAT GTC CTG GAA GCC ATT GAG CCA GGT GTA GTG TGT GCT GGA CAC GAC AAC AAC CAG CCC
Asn Val Leu Glu Ala Ile Glu Pro Gly Val Val Cys Ala Gly His Asp Asn Asn Gln Pro

3990 4020
GAC TCC TTT GCA GCC TTG CTC TCT AGC CTC AAT GAA CTG GGA GAG AGA CAG CTT GTA CAC
Asp Ser Phe Ala Ala Leu Leu Ser Ser Leu Asn Glu Leu Gly Glu Arg Gln Leu Val His

4050 4080
GTG GTC AAG TGG GCC AAG GGC TTG CCT GGC TTC CGC AAC TTA CAC GTG GAC GAC CAG ATG
Val Val Lys Trp Ala Lys Gly Leu Pro Gly Phe Arg Asn Leu His Val Asp Asp Gln Met

4110 4140 
GCT GTC ATT CAG TAC TCC TGG ATG GGG CTC ATG GTG TTT GCC ATG GGC TGG CGA TCC TTC
Ala Val Ile Gln Tyr Ser Trp Met Gly Leu Met Val Phe Ala Met Gly Trp Arg Ser Phe


4170 4200
ACC AAT GTC AAC TCC AGG ATG CTC TAC TTC GCC CCT GAT CTG GTT TTC AAT GAG TAC CGC
Thr Asn Val Asn Ser Arg Met Leu Tyr Phe Ala Pro Asp Leu Val Phe Asn Glu Tyr Arg 

4230 4260
ATG CAC AAG TCC CGG ATG TAC AGC CAG TGT GTC CGA ATG AGG CAC CTC TCT CAA GAG TTT
Met His Lys Ser Arg Met Tyr Ser Gln Cys Val Arg Met Arg His Leu Ser Gln Glu Phe

4290 4320
GGA TGG CTC CAA ATC ACC CCC CAG GAA TTC CTG TGC ATG AAA GCA CTG CTA CTC TTC AGC
Gly Trp Leu Gln Ile Thr Pro Gln Glu Phe Leu Cys Met Lys Ala Leu Leu Leu Phe Ser

4350 4380
ATT ATT CCA GTG GAT GGG CTG AAA AAT CAA AAA TTC TTT GAT GAA CTT CGA ATG AAC TAC
Ile Ile Pro Val Asp Gly Leu Lys Asn Gln Lys Phe Phe Asp Glu Leu Arg Met Asn Tyr

4410 4440 
ATC AAG GAA CTC GAT CGT ATC ATT GCA TGC AAA AGA AAA AAT CCC ACA TCC TGC TCA AGA
Ile Lys Glu Leu Asp Arg Ile Ile Ala Cys Lys Arg Lys Asn Pro Thr Ser Cys Ser Arg

4470 4500
CGC TTC TAC CAG CTC ACC AAG CTC CTG GAC TCC GTG CAG CCT ATT GCG AGA GAG CTG CAT
Arg Phe Tyr Gln Leu Thr Lys Leu Leu Asp Ser Val Gln Pro Ile Ala Arg Glu Leu His

4530 4560
CAG TTC ACT TTT GAC CTG CTA ATC AAG TCA CAC ATG GTG AGC GTG GAC TTT CCG GAA ATG
Gln Phe Thr Phe Asp Leu Leu Ile Lys Ser His Met Val Ser Val Asp Phe Pro Glu Met

4590 4620
ATG GCA GAG ATC ATC TCT GTG CAA GTG CCC AAG ATC CTT TCT GGG AAA GTC AAG CCC ATC
Met Ala Glu Ile Ile Ser Val Gln Val Pro Lys Ile Leu Ser Gly Lys Val Lys Pro Ile

4650 4680
TAT TTC CAC ACC CAG TGA AGC ATT GGA AAC CCT ATT TCC CCA CCC CAG CTC ATG CCC CCT
Tyr Phe His Thr Gln End

4690 4700 4710 4720 4730 4740

TTCAGATGTC TTCTGCCTGT TATAACTCTG CACTACTCCT CTGCAGTGCC TTGGGGAATT 

4750 4760 4770 4780 4790 4800

TCCTCTATTG ATGTACAGTC TGTCATGAAC ATGTTCCTGA ATTCTATTTG CTGGGCTTTT

4810 4820 4830 4840 4850 4860

TTTTTCTCTT TCTCTCCTTT CTTTTTCTTC TTCCCTCCCT ATCTAACCCT CCCATGGCAC

4870 4880 4890 4900 4910 4920

CTTCAGACTT TGCTTCCCAT TGTGGCTCCT ATCTGTGTTT TGAATGGTGT TGTATGCCTT

4930 4940 4950 4960 4970 4980

TAAATCTGTG ATGATCCTCA TATGGCCCAG TGTCAAGTTG TGCTTGTTTA CAGCACTACT 

4990 5000 5010 5020 5030 5040

CTGTGCCAGC CACACAAACG TTTACTTATC TTATGCCACG GGAAGTTTAG AGAGCTAAGA

5050 5060 5070 5080 

TTATCTGGGG AAATCAAAAC AAAAAACAAG CAAACAAAAA AAAAA 
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&0 100 110 leo 


1-'^"'' 1-+C If^.O 170 180 

I'CC ^C-3AGMTCrrTAGGA:-Cr.AL CCTnC~bGiHA5nArC AGAGGGTCCGGAFCAAACC^G 


3^0 £00 aio Ear-o aso s^o 

G*^rrCTi3^;,QAl^GGrATCAGAT.GGGAAr^AGACTGAGTTAf?:CCACT2CAGTBrCATACAGAA 

?w'0 FoO EBO P9'') 300 

GCTTAAGGGACAT-^CCAC3r-AGLCCCAGLrCAnCGAC:AGCCAACGCCT3TTG::AGAGCG 


^'iO 32("^ 330 ShO 350 360 

GCGGCTTClG^AGCCGCCGCCC^GAAG.rTG-ZCCTTTCCTCTTCGGTGA^GTTTCTAAAAGC 


3^0 33'') 370 ^00 AIO 480 

TGCGGr.AGACTCGGr-.GGAAGIGAAGAAAGTGTCCGGTAGGACTACGACTDCCTTTGTCCT 


-1 3 L^i^''^ ^5 'J '^^'-j ^8'"> 

CCITrL^TrcTACGCCTACrrrTCCTGEnTrccrTC-^LCrTGAGrGGACTAGGCAGGC TTG 


■'■i ■ j 5 ] O r. E f " J 53 'J 5 L\ o 

CTGGrCA3LL"CTCTCCCCTACACCACCAGCTCTGCCAGCCAGTTTGrA!:AGi:<GGl AACTl 


i 5d)0 5"r> 53.1) f:ic;-o AO'!) 

CrT-^TGGCTQAAAGCAGACSAGCTTGTTGCrCATTGGAAGGGAGGGTTTTbGGAETi7CAG 


6H0 .~3'- ^co 6^.-0 A,60 

AbALTPAGGAGCAAr.AGC^^CGCTGGAGASTCCCTGATTCCAGGTTCTCCCCCCTGGACCT 


^^■'^'*'' 6G0 o-=?0 70:; -7-^0 730 

ri-ACTGCC rr- ,ircLTCArLCTGT5"T"GTGCAGrTAGAATTGAAAAGAT3(-.AAAGAr^-GTT 


•y-O ^-'-O ^'AOj 7^0 '7S0 

&GC-r'LTTc.-3TAGTCPAMAG"AAwf~rAAAf-Gr aAaA*^G i-i'^AATAGCCCA 

300 V^.l-r-y y^O S''3> ' S^.O 

b ' '^C"-'> -TT0rACCTGrTTrAGTG3AC-TTGACTrTGG^.^GCCAGA3AA T1 TCCTTrr 

StC G'^O SPj b'^0 ^^OO 

CrCCHCTrA^GrTTTGAGLATCrTTT^^ATCTCT-^CT TC Ah-G-ATTTAGGGAC-iAACTGTB 


■■^'p' > CT'SO shO PAr. 

A^A:T^rr:A2r-r,CAG^^TrCTL-CTASrCvIGT3CrTTrrTTTHZA& 

J:*];'!! _ __ ^''-"''''^ lOlo io£o 

^O^rj lO-iv^ iO'^O 1030 

J 1 i i 1 0 n ~'0 1 1 30 1 1 cyn 

CTr-.6-Gf..^i:G-3 i C t AL-L'CALb-(?ICCCrHTCCAA3ACrTATCGA6&ABCSTTCCABAATCTG 
LeuGi^/AraV.-aTvrPrc'HrgPrc.PrDSerLvEThrTvrA-gGlvAlaiPnEL-Un 

1 1230 l^^-O lEZ'O u^60 

TTCL^bAGi..GT6CbCGAAGCGATCCAGAACCCGGGCCCCASGrACC 

P h 6-G 1 -'-iS - r n j li A la-] '* eG 1 rt A b n f- r o G ] y p r o A r n s~! i s P d G } u A 1 a A 1 a E>t?r 

12-70 ^ 12130 3 2^0 1300 13l0 IS^n 
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i^'*^'-' _ 2^00 ^ 1^410 IHE'J 3h3:' 1^40 

LTGGiirjCTGGAGbAGGAALAr-LAGCCTTCr^GAGCAGCAGTC AGCCTrCGAC-GGC 
Lr^uAl ^'Le:(LUutrli.biuGlnLUr,ProSer&lnGlnGln&erA laBerGluG.! vHisPro 

1460 j h70 1 -+8f"- 1 ^^'H 15.-,ri 

GAGAGrjGCXTGLLTLCCGGAGrCTGGMGCTGCCACGGCTCCTGGC^AGGGGGTGCCGCAG 
GIuv:?:--'-Gj vCvEL&uKroGiiiProGl v'Alr'AleThrAIaP-oGl s-LysGivLeuP-'-oG] r. 

_ :!^so 1330 i5.^-o j^fi^o 15^30 

uAiT^LCy^r CAbL-TuCTL 'L:AL-.ATCAGGATGACTCAGr7GCGCCA1 L CACGTTGTC C CT ACTG 
ul :->rrcPr.:.A:aP>-DFrc.AE:i.GlnrHE3AE:n5erAlaAieP--c.SBrTh-~Le...G£?rLe^^^ 
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- " 1 I 5':- 1 .T : •:• 3 1 6.no 

f^AG^^L-^AbC'"-'j:rAbuA--.CA5C AGCAGGAGGTAA"! TG 

Gi .-.b : .-Gi : -.L-.1 i-i5 j r,Gl I'-r^} ,-iG:..Vi:>l } I r^G-^.-Gl oG 1 ySe -Se--^.?^r V^- 1 Arn 

f^-'^"' _ ^'-i:^ r"S;o : : vo i^.^n 

GLAAGGbAGGCL A'.Ti:.Rr.c;-j,3,r;f-;-rTTL ■ T ACCT kGGGGCC.-iA7 TCG 

AIaA-ci...:^iMAiaTn.-Gi v'A 1 aP-r. ^-.~'::.e rSc Lv-h Asp r T vrL euG f v G ] /A- nG^r 


3 1 0 L 3£ ? 1 S30 : 9^ O 1 85'.: 1 3 

ALrr^TATlTGACAG~r-:CrAA5&AGT7bTGTAAAGCAi? 

i 3 1 & & 'J 1 8 X c ' >:) i i Cj i a o 

GT b 'I' A A A C T 6& A r-i C AT C T b A 5TC C A B i3S G A .3c T T C S-I^ 5G DG A L T GC f .-T G T AC S CG 
'-'£■< J b I ut^- 1 ?,L_enG 1 uHi HLeu'£.t?rPrr,i3 1 --Gi uG Inl euAr cG? yAspC --'e "le t TyrAJ a 

I ? * 1 '^'-VO 1 ^'SC. ■"•£> :) 5 vT'O 3 950 

'rCCCTC'ITGEr-'.-QGTCCACCCGCCGTGCGI CCC ACTlCTTGTGlGCCTCTGGCCGAATGC 
^erL^uL^.tGl /G] \ F rof- ro A 1 a^^'a 1 Ai-qPrcThrProC' sAiaProi-euAl aGluCys 

i ^'^'''^ EO'I'-O 5'M G POEO H0?0 £0'40 
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L VE G 1 ' Lb' iGerLeit A^pG i uG 3 y ProG I yL v = G : --'T hrGi i-b 1 uTh r A 1 c^G July r St^r 

£''.I'5C' £060 £'r>70 POSO EO*='0 G100 

TCT""Trr-.A&GGAGGTTACGCCAAAGGGTTGGAAGGTGAGAGTCTGGGLTGCTCTGGGAGC 
S e Pr, / c G i v'G -1 vTy T A 1 r L v } v Le^u G 1 u G 1 v/G 1 u =.cr r LeuG 1 y C v sSe r G j y St^r 

S 3 1 Ci E* 1 EG 2 3^'i 2 1 -^v'.J £ 1 50 a 1 60 

AGTGAAGC AGGTAGr:Ti:TGGGACArTTGA&ATCCCGTCCTCA::TGTCTCTGTATAAGTCT 
GE-r GIuh: rGl v'G'£'r5e^-G 1 vTh>" LeuG 1 u T i eFrcGer GerLeLiGer'_e'_'T yrLx-nGBr 

^1"0 £190 £1^0 EEOO R^iO £EPO 

f CAGCA:rT.-^GACGAGGCA6CAGCAT-CCA3AHTGG::GALTArTfr^CAACTTTGLn:nT 
G J '-'H 1 -> v'c, i P/FpG J uA i ^ ^- i a A] aTyrGl nA&nArQAsffTy .- yvr As nPhePr oLe-u Al a 

EG'^r. GE^O £260 HH"^ > EESO 

CTGTC LGGGCCGGGGrACCCCCCGCCCCCTACGCATCCAC^^CGC^CGLATrAAGGTGGAG 
LE..'S'-*rGl%i='. -PreHz EProProProPrrThr Ml Ep^o^^'iEAl a Arg I le^LvELEuGlu 

_SP'--0 £31)0 £3 10 ^5£0 £330 E340 

'^'^-'-5-'^'"^'~^^'3ACTACGGCAGCGCCTGGGCTGCGGlGGCAGCGGAATGCGGCTA1 GGGGAG 
As:. rt-Tr.GerAspT'.TGJ vL-pr A] aTrp Al aA 1 aAl af^t ] r*Al £ Gl nCvsArgTyrBl yA=p 

EST'"' E3.30 £3'^0 £3E0 £390 £^00 

TTGGCTAGCr.TAGATGGAGGGAGTGTAGrCGGACGCAGGArTGGATrGCCCCCAGCCACC 
L eu A I aSe r L E L' H 1 sG 1 y G i \ Se? r v'a 1 A 1 aG 2 yPr d Se r Th r G 1 ^ -Ser Pr c <=* r o A 1 r- Th r 

P-i O E^PO £^30 E^hO £^10 E'^oO 

br.„.TrTTLTTrrTG6CATArTrTCTTCAGAGGTGAA6AAGGCCAATTATA7GGG::CAGGA 
Ala'-^-M-'3.-.-Scr-- I rpHisThrLc-uPne"hrAiaGluGlL!Gl --'G 1 nLt?u~yr&lv PrcGl / 

„„ '^^^''^ E-^'-O £500 £51 £^-£0 

i .ir.L.".':- ,j.3l cz-L AGT AGT C rAAGGGATGGTGGGGr^G" (^GCCCCGl ATGGCTACACT 

b 1 V CO '.'G ] ■-' C i b e - Ger Ge r Ft o r t r A h p A 1 a 3 i ^ ' H r c ^-^a j A 1 a r* r « T , ^- G 1 v T */ r Th r 

__^^''"'0 Er-fO E'550 £S-r?'j PvVO £530 

C G *: L L G r AGGGGf T GG L A AGCC AGG-GGG i r T "i G "r r TGG G TG T '3-^- l " 3 T33T A T 
Ar 'jPrr I 'r ! .-;G1 v L <-u A i abcrr Ai nGl uGI yAcpPht. G.'?r Ai ^.btS-G i ■..-'^ 1 frpTv - 

3600 £610 EoEO E.r.Sr; E-^^C 

PC i bGTGGAG J VG ) br-AGA8AGTrrcCTATCCCAGTCCCAnrTGTGTTA''W' AGVGAAATG 
PrcG v-r- ' s'V:? j Va 1 r-iz cVa i f-'roTyrPrDSc-r Prc-Se -G-> eV--*] L vs£3e--G Ui'~le- 1 
Figure 7. Frozen section of rat ventral prostate stained with antibodies
(AR-52-3-P) to the AR peptide NH2-Asp-His-Val-Leu-Pro-Ile-Asp-Tyr-Tyr-
Phe-Pro-Pro-Gln-Lys-Thr at a dilution of 1 to 3000 using the avidin-biotin
peroxidase technique. Androgen receptor is indicated by brown staining of nuclei
in epithelial cells. Immunostaining was performed as previously described (60).